Technical SEO Mastery: Lessons from the GOAT, Wikipedia

From a technical SEO perspective, Wikipedia is perhaps the greatest of all time (GOAT). Each month, it receives billions (yes, billions!) of organic visits across its 46M+ articles in 300 languages.

It’s trusted by millions, and the Wikimedia Foundation deserves credit for implementing quality assurance processes, such as Featured Articles and the Good Articles nomination process, to continuously improve the accuracy of its content.

Image credit: QuotesPop.com

Perfecting an open-source encyclopedia takes time, as does perfecting technical SEO for large enterprise sites. Without robust technical SEO, search engine spiders can’t properly crawl, index, and rank your thousands or millions of pages.

While filthy-rich content and high site authority play a significant role in Wikipedia’s SEO dominance, I argue that its technical SEO plays the most important role, allowing it to rank at the top, or at least on page 1, for almost every informational keyword.

This is a tribute post that extracts lessons from, and praises, Wikipedia’s platform for its SEO mastery. The same platform also powers Wikipedia’s sister Project sites, such as Wikidata, Wikinews, and Wiktionary.

We’ll study various technical SEO techniques that’ve helped Wikipedia rank at scale on page one on desktop, and on mobile:

  • Domain Setup / Internationalization
  • Sitewide Links / HTML Layout
  • Meta Descriptions
  • Page Templates / URL Formats
  • Site Architecture
  • Mobile-First Indexing
  • Page Speed
  • The Single Most Important Enterprise SEO Success Factor

Domain Setup / Internationalization

Wikipedia’s mission is to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally (source).

Or in other words, to make information freely available to everyone.

The role of SEO in this mission: to help every single person in the world find and comprehend information on any subject in any spoken language. This is where a robust, scalable international SEO strategy comes into play.

You can localize content in multiple languages using sub-domains or sub-directories on a single domain, or you can set up regional websites on multiple domains using country-code top-level domains (ccTLDs), such as Wikipedia.in (India), Wikipedia.fr (France), and Wikipedia.gr (Greece).

Whether you choose sub-domains, sub-directories, or ccTLDs, any of the 3 approaches can help you dominate international SEO if done correctly.

While Wikipedia sporadically uses ccTLDs, it primarily goes the sub-domain route to excel in ranking worldwide.

Wikipedia International SEO Strategy

The domain wikipedia.org is set up to support 300 desktop sites on sub-domains:

  • www.wikipedia.org (the global homepage)
  • en.wikipedia.org (English)
  • fr.wikipedia.org (French)
  • And another 298 language sites

It’s also set up to support 300 mobile (m.) sites on sub-sub-domains, such as en.m.wikipedia.org.

The global home page www.wikipedia.org is the starting point. It invites bots and humans to browse its most popular Wikipedia sites by language.

[Image: Wikipedia, the free encyclopedia]

Or, you can browse its family of Project sites:

[Image: Wikipedia’s Project sites]

The endless river of link juice flows into Wikipedia and is evenly distributed to ~320 Main pages of Wiki sites. Here are these link metrics, via Ahrefs:

[Image: Wikipedia link metrics via Ahrefs]

As you get into the Main page of each sub-domain, for example, the English site’s home page, you notice that Wikipedia continues linking to the equivalent Main page in foreign languages.

Every subsequent page, including the millions of Wiki articles, contains a link section where dynamically generated links point to the corresponding article in all other available languages.

Here’s the English page for The Office.

[Image: Wikipedia articles in different languages]

Google suggests you annotate links to foreign-language versions of a page using the rel="alternate" and hreflang attributes, either on link elements in the HTML <head>, in HTTP headers, or in Sitemaps.

Wikipedia may use their Sitemaps, but they certainly don’t use the headers to point to multilingual content.
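If you do go the HTML <head> route on your own site, the annotations can be generated from a simple mapping of language codes to URLs. Here’s a minimal sketch of that idea. The function name `hreflang_link_tags` and the sample URLs are my own illustration, not anything Wikipedia or Google ships.

```python
from html import escape


def hreflang_link_tags(alternates: dict[str, str]) -> list[str]:
    """Build one <link rel="alternate"> tag per language version of a page."""
    tags = []
    # Sort by language code so the output order is stable between runs.
    for lang, url in sorted(alternates.items()):
        tags.append(
            f'<link rel="alternate" hreflang="{escape(lang)}" '
            f'href="{escape(url)}">'
        )
    return tags


# Hypothetical language versions of a single article.
alternates = {
    "es": "https://es.wikipedia.org/wiki/The_Office",
    "en": "https://en.wikipedia.org/wiki/The_Office",
}

for tag in hreflang_link_tags(alternates):
    print(tag)
```

Every language version should list the full set of alternates, so in practice you would render the same list into the <head> of each page in the group.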

The most obvious way Wikipedia annotates links is by marking up the actual anchor tag of each link. From the English page for “The Office,” here’s an example of a link to the Spanish version:

<a href="https://es.wikipedia.org/wiki/The_Office" title="The Office – Spanish" lang="es" hreflang="es" class="interlanguage-link-target">Español</a>

You can get a deeper understanding of multi-lingual SEO from this article: Multilingual SEO: Translation and Marketing Guide.

Sitewide Links & HTML Layout

Wikipedia codes their pages’ templates beautifully for spiders by prioritizing content and internal links in 2 ways:

  1. It doesn’t place any “important” links (i.e., to pages with a lot of SEO traffic potential) in the <footer> section.
  2. It places sidebar links towards the bottom of the source code in the <body> section.

But why are these two things beneficial from a technical perspective?

Well, Google has stated that both sitewide and footer links are not given much weight; that is, they pass less link equity (or PageRank). This is likely because such links are deemed to be less important for users. After all, when did you last click a link in a website’s footer? I can’t remember the last time I did.

Thus, Wikipedia places only less-important links in its footer section: links to pages that don’t need to rank for any competitive keywords, such as its privacy policy, “about” page, and so forth.

But Wikipedia has another trick up its sleeve… It adds a kind of additional “sub-footer” section to each page, which contains a bunch of dynamically generated links related to the overall topic of the page.

Because this sub-footer is dynamically generated for each page, none of the links are sitewide. Therefore, they don’t get devalued by Google.

And it’s a similar story with the links in the left sidebar.

Most links in the sidebar are for editors and users (i.e., for navigational purposes). And the sidebar is sitewide, so it makes sense not to include any important links (in terms of SEO) in this section.

But again, Wikipedia goes one step further…

Their page is coded in such a way that the sidebar HTML is placed towards the bottom of the source code (it’s still in the <body> section, but right at the bottom). This allows contextual links at the top of the page to receive more link equity, while the sidebar links are further demoted.

Lesson: Place links to rank-worthy target pages contextually at the top of each page, using rich anchor text. Place second-tier links in sidebars below your contextual links in the page source.
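To check where links actually fall in your own page source, you can parse the HTML and record each link’s position in source order. The sketch below uses Python’s built-in html.parser. The class and function names, and the sample markup, are hypothetical and only meant to illustrate the audit.

```python
from html.parser import HTMLParser


class LinkOrderParser(HTMLParser):
    """Record each link's href in the order it appears in the source."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def link_positions(html: str) -> dict[str, float]:
    """Map each href to its relative source position (0.0 = first link)."""
    parser = LinkOrderParser()
    parser.feed(html)
    last = max(len(parser.hrefs) - 1, 1)  # avoid dividing by zero
    positions: dict[str, float] = {}
    for index, href in enumerate(parser.hrefs):
        # Keep the earliest position if the same href appears twice.
        positions.setdefault(href, index / last)
    return positions


# Hypothetical page: contextual link first, sidebar and footer links last.
page = """
<body>
  <div id="content"><p>See <a href="/wiki/Dunder_Mifflin">Dunder Mifflin</a>.</p></div>
  <div id="sidebar"><a href="/wiki/Help:Contents">Help</a></div>
  <footer><a href="/wiki/Privacy_policy">Privacy policy</a></footer>
</body>
"""

for href, position in link_positions(page).items():
    print(f"{position:.2f}  {href}")
```

A low number means the link appears early in the source. If your money pages show up near 1.0 while navigation links sit near 0.0, your template order is the opposite of Wikipedia’s.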

Meta Descriptions

What meta descriptions? Wikipedia leaves them blank.

Wikipedia uses title tags in the <head> but doesn’t bother populating descriptions, which goes against all standard SEO advice. Not gonna lie, if I were consulting for them, I’d give them the standard advice and plead my case, too:

You should have a keyword-rich description, with at least one call to action, to entice high click-through rates, which could indirectly lead to higher rankings. Meta descriptions are easy to implement across the entire site with a template like this:

“Learn more about [Wiki topic], or join over 100,000 contributors and add your own knowledge and expertise about [Wiki topic].”

To which I’d evoke the Michael Scott death glare:

Image credit: Reddit

I’d be wrong to give this advice. Creating a generic, one-size-fits-all description template or even encouraging contributors to update the meta description by hand doesn’t make sense for Wikipedia. Every Wiki article fits thousands of search queries.

Long-form Wiki articles, like this one about the Sun, rank for over 4,900 keywords per the Ahrefs Organic Keywords report. The best thing to do is leave the description blank and let Google figure out which snippets to display for any query.

[Image: Ahrefs Organic Keywords report for the Sun article]

Every Wiki article starts with the topic in bold and a simple sentence structure, clearly answering the who, what, where, or when, just as a meta description would do anyway.

This formatting could also improve Wikipedia’s chances of earning featured snippets.

Here’s the search result for “What is the Sun?”

[Image: search result for “What is the Sun?”]

The Wiki page:

[Image: the Sun article on Wikipedia]

Google does a decent job of extracting relevant descriptions for other related searches. Here’s the search result for “what type of star is the sun?”

[Image: search result for “what type of star is the sun?”]

Here’s the search result for “age of the sun”:

[Image: search result for “age of the sun”]

Image Credit: Imgur

Page Templates & URL Formats

Wikipedia predominantly uses just one page template and URL format to rank pages — the Wiki article.

Wikis: https://en.wikipedia.org/wiki/[Topic]

Of its top 100,000 organic landing pages, 99.95% are Wiki articles residing in the /wiki/ sub-directory, per Ahrefs Top Landing Pages.

Even its sister sites use the same URL format and a single page template to structure their core content. Examples:

  • Wikimedia Commons:
    https://commons.wikimedia.org/wiki/File:Jenna_Fischer5crop.jpg
  • Wiktionary:
    https://en.wiktionary.org/wiki/pallet/
  • Wikiquote:
    https://en.wikiquote.org/wiki/The_Office_(U.S._TV_series)
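A single, predictable URL format like this is easy to generate in code. Here’s a rough sketch that builds a /wiki/ URL from a page title. The `wiki_url` helper is my own illustration and it simplifies things: the real MediaWiki title rules cover more cases (capitalization, namespaces, special characters) than this handles.

```python
from urllib.parse import quote


def wiki_url(title: str, lang: str = "en", project: str = "wikipedia") -> str:
    """Build a /wiki/ URL from a human-readable title (simplified rules)."""
    # Spaces become underscores, as in the example URLs above.
    slug = title.strip().replace(" ", "_")
    # Leave punctuation that appears unescaped in the example URLs as-is.
    return f"https://{lang}.{project}.org/wiki/{quote(slug, safe='_()./:,')}"


print(wiki_url("The Office (U.S. TV series)", project="wikiquote"))
# https://en.wikiquote.org/wiki/The_Office_(U.S._TV_series)
```

The point is less the code than the constraint: one template and one URL pattern means one set of rules to maintain across millions of pages.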

Site Architecture

If you dig into Wikipedia’s website architecture, here’s what you will find.

Secondary Navigation

While Wikipedia uses other page templates, such as Portals, Categories, Lists, as well as editor-friendly pages you see in the left sidebar, these pages exist for secondary navigation and general information. These pages rarely rank for any search terms.

For example, Portals are topic pages that exist as additional entry points from the home page, such as Geography. A Portal seemingly exists for editors to click into the topics they’re interested in contributing to. For bots, it’s like an index sitemap welcoming search engines into the world of all the Wikis.

Wikis

Geography is both a Portal and a Wiki. Guess which one ranks better?

The Wiki!

As an SEO, you wonder “shouldn’t the Portal Geography page outrank the Wiki, as it’s one click away from the home page, the most authoritative page, and it contains good unique content?” The reasons are likely as follows.

While the Portals link to the Wikis, Wikis don’t typically link back to Portals. Meanwhile, all Geography-related Wikis link back to the Geography Wiki. So overall, in terms of URL Rating, the Wiki Geography page is stronger than the Portal Geography page.

In fact, the Geography Portal page URL Rating is 40 and it ranks for zero organic keywords.

But the Geography Wiki page URL Rating is 73. It ranks for 2.5K+ organic keywords.

Lesson: Both the prominence and the quantity of internal links determine which of 2 pages with similar content and the same target keyword outranks the other. Placing a page high up in the site hierarchy, even just 1 click away from the home page, doesn’t guarantee good rankings.

If Wikipedia pointed more links to the Portal Geography page from all relevant Wiki Articles, perhaps by using breadcrumbs, that page would likely beat the Wiki Geography page.
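You can run the same comparison on your own site by counting inbound internal links per page from a crawl export. The sketch below uses a tiny, made-up link graph to show the Portal vs. Wiki pattern. The page names and the `inbound_link_counts` helper are hypothetical, and raw link counts are only a crude stand-in for a metric like URL Rating.

```python
from collections import Counter

# Hypothetical internal link graph: page -> pages it links to.
links = {
    "Main_Page": ["Portal:Geography"],
    "Portal:Geography": ["Geography", "Cartography", "Climate"],
    "Cartography": ["Geography"],
    "Climate": ["Geography"],
    "Geography": ["Cartography", "Climate"],
}


def inbound_link_counts(graph: dict[str, list[str]]) -> Counter:
    """Count how many distinct pages link to each target page."""
    counts: Counter = Counter()
    for source, targets in graph.items():
        for target in set(targets):  # count each linking page once
            if target != source:     # ignore self-links
                counts[target] += 1
    return counts


counts = inbound_link_counts(links)
print("Geography:", counts["Geography"])                # 3
print("Portal:Geography:", counts["Portal:Geography"])  # 1
```

Even in this toy graph, the Wiki page collects links from every related article, while the Portal only gets one from the home page.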

Primary Navigation

The primary way to navigate Wikipedia is by using its on-site search or by clicking from Wiki to Wiki. Wikipedia’s contextual linking makes it easy for bots and users to browse the site. While the secondary navigation shown above uses a rather deep structure, the primary navigation consists of a beautifully designed flat site architecture.

[Image: Wikipedia site architecture]

The five front-and-center content sections feature and constantly rotate timely or random Wiki articles. These articles contextually link to other related Wiki articles, and so on. Wikipedia doesn’t use mega menus or faceted navigation, as it doesn’t use a top-down categorization structure. It’s only 2 levels deep.

Categorization

As an SEO, it’s perplexing to see that an encyclopedia with millions of articles, which could easily follow a categorization structure like this, refuses to:

Image credit: Moz

You wonder how Google categorizes all this content and indexes it in neatly ordered taxonomies. I mean, Wikipedia doesn’t even use breadcrumbs, so how’s Google to create parent and child relationships between categories and articles?

Well, what Wikipedia teaches us is that child and parent relationships don’t matter if your contextual internal linking is super-relevant, abundant, and free of wasteful links (404s, duplicate/thin content pages, etc.).

Basically, Wikipedia treats every topic (categories, subcategories, and sub-sub-categories) as a Wiki article, and interlinks them all contextually.

In comparison, Encyclopedia.com’s top-down structure requires 4 clicks from the home page to get to an article. So they have to turn to faceted navigation and breadcrumbs to help reinforce the parent-child categorization.

Internal Link Proximity

The Six Degrees of Separation is the idea that any person is connected to any other person on the planet by no more than 5 intermediary acquaintances.

Likewise, on Wikipedia, it takes on average only 4.5 clicks to get from a Wiki article to any other Wiki article.

Image Credit: DataPub CDLIB
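If you want a comparable number for your own site, a breadth-first search over your internal link graph gives the fewest clicks between any two pages, and averaging those gives a “degrees of separation” figure. I don’t know how the 4.5 figure above was computed, so treat this as a sketch of one reasonable approach. The three-page graph is made up.

```python
from collections import deque


def click_distances(graph: dict[str, list[str]], start: str) -> dict[str, int]:
    """Breadth-first search: fewest clicks from `start` to each reachable page."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in distances:
                distances[target] = distances[page] + 1
                queue.append(target)
    return distances


def average_click_distance(graph: dict[str, list[str]]) -> float:
    """Mean fewest-clicks distance over all ordered pairs of connected pages."""
    total = pairs = 0
    for start in graph:
        for target, clicks in click_distances(graph, start).items():
            if target != start:
                total += clicks
                pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical link graph: each page links to exactly one other page.
graph = {
    "The_Office": ["Steve_Carell"],
    "Steve_Carell": ["Scranton"],
    "Scranton": ["The_Office"],
}

print(average_click_distance(graph))  # 1.5
```

On a real site you’d build the graph from a crawl. Running a search from every page gets slow on very large sites, so sampling start pages is a common shortcut.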

One of Wikipedia’s greatest software functions is the ability for editors to easily cross-link to Wikis. Within the body of each article, you’ll notice that editors tend to hyperlink almost all concepts or subjects to the matching Wiki article. If you’re not using menus and breadcrumbs, this is the only way possible to establish strong link relationships across a site with millions of pages without using automatic linking software.

Internal & External Link Counts per Page

Moz, for example, suggests you keep to roughly 150 links per page. Matt Cutts suggests keeping them to 100 so that you don’t overwhelm users with a poor experience. It’s widely believed, and for good reason, that excess links on a page hurt PageRank distribution and don’t do users any good. Most sites should stick to 150 links or fewer.

Wikipedia wishes it could, but can’t.

The US page for The Office contains over 2,300 links.

  • ~100 sidebar links for editors and users to pages containing little to no value in ranking for non-brand terms.
  • ~225 (10%) external links and citations — the infamous ‘nofollow’ links SEOs love debating over.
  • ~150 links to foreign language versions of the Wiki article.

These make up roughly 20% of all the links on the page, and they live in the lower-priority section at the bottom of the HTML.

[Image: generic navigational links]

The remaining 1,800 or so links (80% of the total) are jump links and contextual links with rich anchors, prioritized at the top of the HTML.

[Image: contextual links]
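The 20/80 split is easy to verify from the approximate counts quoted above. Here’s the arithmetic as a small script. The `link_shares` helper is just my own illustration, and since the page has “over 2,300” links, the results are rounded estimates.

```python
def link_shares(
    total_links: int, lower_priority: dict[str, int]
) -> tuple[int, float, float]:
    """Return (contextual link count, lower-priority share, contextual share)."""
    low = sum(lower_priority.values())
    contextual = total_links - low
    return contextual, low / total_links, contextual / total_links


# Approximate counts quoted above for The Office page.
contextual, low_share, contextual_share = link_shares(
    2300, {"sidebar": 100, "external": 225, "interlanguage": 150}
)

print(contextual)                 # 1825
print(f"{low_share:.0%}")         # 21%
print(f"{contextual_share:.0%}")  # 79%
```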

Rand Fishkin suggests that Google gives links higher up in the HTML more weight than those lower down. I still believe this works today as an evergreen internal linking tactic.

Search Engine Crawls

Question: When Googlebot starts crawling Wikipedia, does it ever finish?

Googlebot web crawls are determined by some combination of so-called domain authority and individual page authority (PageRank), frequency and prominence of internal links, URL prioritization (via Sitemaps for example), and content updates. Considering Wikipedia nails all of these factors, what does a typical Wikipedia site crawl look like?

While it might delight any SEO professional to take a sneak peek at Wikipedia’s server logs or its Webmaster Tools Crawl Stats, that information isn’t publicly available. What is publicly available is this little-known traffic statistics tool by WMFLabs, where you can see all kinds of interesting pageview stats at a page level or at a Project level. You can even see search engine spider crawl activity by Project dating back to July 2015:

[Image: Siteviews Analysis]

While not specific to Googlebot, all search engine crawlers average over 40MM pageviews per day. In comparison, humans average over 250MM pageviews per day.

How does this compare to the number of pages search engines crawl on your site? If you’re not already doing so, check your Webmaster Tools Crawl Stats, and for deeper analysis, try to regularly review your server logs. For most sites, you can use a tool such as the Screaming Frog Log Analyzer.
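If you’d rather start with a script than a dedicated tool, counting crawler hits in an access log takes only a few lines. This sketch assumes the common “combined” log format and matches on the user-agent string alone. The sample lines, IP addresses, and the `crawler_hits` helper are made up, and user agents can be spoofed, so treat the counts as a first pass.

```python
import re
from collections import Counter

# Pulls the request path and user agent out of a combined-log-format line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def crawler_hits(lines, bot_token: str = "Googlebot") -> Counter:
    """Count requests per path whose user agent contains `bot_token`."""
    hits: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and bot_token.lower() in match["agent"].lower():
            hits[match["path"]] += 1
    return hits


# Hypothetical log lines: two crawler requests and one human visit.
sample = [
    '192.0.2.1 - - [10/Oct/2017:13:55:36 +0000] "GET /wiki/Sun HTTP/1.1" 200 2326 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.1 - - [10/Oct/2017:13:56:02 +0000] "GET /wiki/Sun HTTP/1.1" 200 2326 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [10/Oct/2017:13:57:10 +0000] "GET /wiki/Moon HTTP/1.1" 200 1804 '
    '"https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

for path, count in crawler_hits(sample).most_common():
    print(count, path)
```

Point it at a real log by passing an open file object instead of the sample list.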

Mobile-First Indexing

It’s only a matter of time until Google rolls out mobile-first indexing in early-mid 2018, and SEO blogs and forums hit the panic button.

Wikipedia will ignore the chatter. They’ve been ready for mobile-first.

Look at the mobile version of the Office page, and notice the similarity with its desktop content. The main article content is the only content that appears on the mobile site. The only sidebar links that appear on mobile point to the equivalent article in other languages, and again, are pushed to the bottom of the HTML. All the other sidebar bloat-causing links are removed altogether.

[Image: mobile version of The Office page]

Needless to say, when Google rolls out algorithms based on mobile-first indexation, Wikipedia’s ready to keep on ranking. Worth noting, it looks like Wikipedia has also shied away from hopping on the AMP bandwagon.

Page Speed

The mobile page for the Office in the screenshot above scores 89 on mobile and 95 on desktop in Google’s PageSpeed Insights. Wikipedia uses the m. mobile approach to show mobile content, as opposed to a responsive or adaptive approach, and it doesn’t redirect desktop users when they request the mobile page.

When requesting a desktop page from mobile, the desktop page does redirect to the m. site, accordingly.
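The hostname convention behind that redirect is simple enough to express in a few lines. The sketch below only illustrates the URL mapping from a language sub-domain to its m. counterpart. It is not Wikipedia’s actual redirect logic, which I haven’t seen, and the `mobile_url` helper is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def mobile_url(url: str) -> str:
    """Map lang.domain.tld to lang.m.domain.tld; leave other hosts unchanged."""
    parts = urlsplit(url)
    labels = parts.netloc.split(".")
    # Assumption: a three-label host is a desktop language site,
    # e.g. en.wikipedia.org. A four-label host already has the "m." label.
    if len(labels) == 3:
        labels.insert(1, "m")
    return urlunsplit(parts._replace(netloc=".".join(labels)))


print(mobile_url("https://en.wikipedia.org/wiki/The_Office"))
# https://en.m.wikipedia.org/wiki/The_Office
```

If you run separate m. URLs like this, the path stays identical between the two versions, which keeps the desktop-to-mobile mapping one-to-one.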

[Image: PageSpeed Insights results]

For good measure, let’s score the same URL on Pingdom and GTMetrix.

[Images: Pingdom and GTMetrix results]

All 3 tools find that Wikipedia can improve page load by leveraging browser caching, optimizing images, and combining JS and CSS files so that content above-the-fold loads quickly.

Here’s the thing about page speed testing regarding SEO: very few sites get it perfect because most web teams don’t try to nail every tiny little recommendation. Page speed tools want you to optimize every single thing that loads visibly or loads invisibly in the background of a page.

Even if you optimize all of your own assets (your servers have fast response times, you use a CDN to deliver static files, you serve clean, efficient HTML), you still have to solve for the third-party applications, plugins, HTML, JavaScript, CSS, images, etc. that make up the remainder of the page.

Setting expiration dates on all available cacheable resources, compressing every single image, in-lining all above-the-fold CSS, JavaScript, and HTML, are just a few tasks for the webmaster to score above 90 using any of these page speed test tools.
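As one example of auditing these tasks, here’s a sketch that flags static assets whose Cache-Control header has no max-age or a short one. The one-week threshold is my own assumption, not an official cutoff from any of these tools, and the header values are made up.

```python
from typing import Optional

ONE_WEEK = 7 * 24 * 60 * 60  # assumed threshold in seconds


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Extract max-age (in seconds) from a Cache-Control header value."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def needs_longer_caching(cache_control: str, minimum: int = ONE_WEEK) -> bool:
    """Flag header values with no max-age, or one shorter than `minimum`."""
    age = max_age_seconds(cache_control)
    return age is None or age < minimum


# Hypothetical header values for three static assets.
for value in ["public, max-age=31536000", "max-age=3600", "no-cache"]:
    print(f"{value!r}: needs longer caching = {needs_longer_caching(value)}")
```

In a real audit you’d collect the header values from your crawler or from your CDN logs and run each one through the same check.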

Even then, there’s no guarantee of excellence because browsers using HTTP/1.0 or HTTP/1.1 can only request a few files at a time. The solution lies in HTTP/2, which allows multiplexing: a browser or a spider can request multiple files in parallel over a single connection.

Wikipedia has adopted HTTP/2 to improve page speeds for users today, and while Google still hasn’t enabled HTTP/2 for Googlebot crawls, it still recommends you implement it.

The Most Important Technical SEO Success Factor

The platform on which a small, medium, or enterprise website is built ultimately determines your SEO ability and scalability. The platform (AKA the framework) comprises everything that powers a website or family of websites: the servers, the software/code (MediaWiki software and PHP, in Wikipedia’s case), the databases, the content, and the design.

If you understand these building blocks, and how they specifically create your overall site experience, you realize the true limits and true possibilities of technical SEO. You can experiment and infuse different SEO techniques as your platform allows to design a harmonious search engine and user experience.

The Wikimedia Foundation wins enterprise SEO with their platform, where most enterprise organizations struggle due to archaic infrastructure, internal politics, and inefficiency.

If Wikipedia executed like most large enterprises, its SEO technology wouldn’t be as powerful as it is today. And without SEO domination, who knows if Wikipedia would exist as the household brand it is today?

The platform also enables Wikipedia’s dozen sister Project sites, such as Wikidata, Wikinews, and Wiktionary, to piggyback on it and position themselves to dominate web search too.

Conclusion

Wikipedia is far from perfect, both as an SEO platform and as the world’s most accurate encyclopedia. Like YouTube, Reddit, and Twitter, it too has systemic biases that keep it from truly becoming the deepest, truest, and richest source of knowledge.

Hopefully, Wikipedia’s founders keep working on it.

Whether you agree or disagree with my assessments of their technical SEO foundation, I believe SEO professionals of all levels can greatly benefit from observing how large websites like Wikipedia structure their code and content at a page level and at a site level.

What Wikipedia tactics have you tested, and what results have you seen? What tactics are you hoping to apply? Which of Wikimedia’s sister projects do you predict to be most successful?

