Let me start off by saying that digital marketing for local and small businesses does not need to be difficult. Many of the companies I’ve worked with in the past see it as a daunting task that is going to take too much time, money and a team of people to do. Sometimes they are right, but a majority of the time they are just trying to do too much at once. Internet marketing as a whole encompasses a broad set of activities, but local and small businesses do not need to do all of them. They just need to find specific strategies that work for them and capitalize on them.
This blog will be most helpful for local and small businesses such as electricians, plumbers, heating and AC services, house cleaning companies, restaurants, lawyers and other small business types.
My goal is that you get at least one takeaway that will help make a positive impact on your business's digital marketing efforts. If you still have questions, feel free to ask them in the comments section.
OK? Here we go...
Tip #1: Your Website - Keep it simple
Your website is the face of your company online. It’s a marketing brochure for your brand, services, what your company stands for and the kinds of clients you work with. It should contain all the most important things about your business. This makes it easy for someone to find the information they need before making the decision to contact you.
In regards to creating a website, there are plenty of great companies out there that will outline and build a site for you. Just make sure you do research before contacting them so you know what to expect. The best tip I have for building a site is to... keep it simple.
Keeping a website simple can be very good for user experience, website upkeep and management, and especially SEO (search engine optimization). Here is an example of a basic outline that will encompass everything a viewer should know about your company. This outline will still leave room for you to expand your site and create more content with SEO in mind.
Individual Service Pages (including pictures and/or video of past work - if applicable)
Testimonials / Success Stories / Reviews
Resources (if applicable)
Could be used as a “self-help” or “how-to” section for certain business types
One thing to keep in mind is the kind of imagery and copy on your site. A lot of companies decide to use stock imagery and fairly generic copy. I would recommend going another route because this is a big opportunity for local businesses to show that they are human and their company has personality. This is where you can beat big brands for new business. To break this down a bit:
For imagery, use images from real projects or services you’ve done in the past. Spend some time ideating what kind of imagery you want to show on your website. Then hire a photographer (yes definitely spend money on these) to get the pictures you desire for your site.
For the copy, hire a copywriter who has done work previously for businesses in your industry. Make sure you give plenty of input so they know exactly what you are looking for and they stay in-line with your brand voice. And be sure they are experienced in writing copy with SEO in mind -- that is very important. ;)
One local site that has done well in both categories is Johnson Roofing & Gutters (the name is just a coincidence and I have no affiliation with them).
Last but not least, make sure your website works on all screen sizes so people can view it on all devices (computer, tablet and smartphone). A responsive design or a separate mobile-specific version is a must. Also, if you already have a website, I highly recommend going through the Technical SEO Audit Checklist for Human Beings to ensure it is healthy and will perform well in search engines like Google and Bing.
Tip #2: Reviews - Your customers are your biggest advocates
Reviews are VERY important for your company's reputation, especially online. Positive or negative reviews, and how you handle them, can be the deciding factor in a customer's decision to contact you. This is why it is good practice to respond to every review you can.
Similar to using real images of your work and developing copy that gives your site personality, replying to reviews shows people that you are human. Local and small businesses need to show they care about the customer. If someone gives you a positive review, say thank you. If another person gives you a negative review, respond like you would if someone gave you negative feedback in-person. This shows that you care about what people are saying about your company and you want to engage with your customers even after working with them.
Here is an example of a company that does a good job replying to reviews on Yelp.
I would like to highlight two things here:
This company replies to all review types. Even 5-star reviews get attention, which is important because it shows the customer you care about their feedback and their business is important to you. This is a tactic that can help you get a customer for life.
The response to a bad review doesn't just stop at an explanation. It goes one step further and asks the person to get in contact with the general manager (contact info included), so they can learn more about the experience. This is a good tactic to (hopefully) change someone's opinion after a not-so-great, or misinformed experience.
Along with the above, reviews can be very helpful for SEO and your company's ability to rank above the competition. Be sure to ask for reviews on sites like Google, Yelp and Facebook so you have a diverse review profile across a number of websites. Once you receive reviews, monitor them on a weekly basis so you can reply and engage with past customers. A good resource to learn more about why reviews are important is Moz's resource on Local Reviews and Ratings.
Tip #3: Social Media - Own your channels and grow your audience
Businesses of any size should be using social media. Create a company page, update it with all necessary information and use it on a regular basis. Social media sites like Facebook, LinkedIn, Instagram and Twitter can be very helpful when marketing a company. Plus, they are great sites to promote your services and post original content such as blogs and video. Use them correctly and you can find and engage with your target market (aka customers) and grow your following. Social media signals are starting to have an effect on SEO, too.
Some websites may have already created a generic page for your business. It’s good to do a sweep of all the main local and small business profile sites and check to see which ones have done this. Once you’ve identified the sites with your profile you can simply request to take ownership of the page and update it with all the correct information. Here is a small list of sites to check:
Now that you’ve taken time to create and/or round up your company profiles, update them with the most accurate data and include photos if possible. Then identify the ones that will be the most effective and start using them to your advantage.
Tip #4: Email Marketing - Keep them short and sweet
Email marketing is one of the best ways to market your services and content to a quality audience. People who sign up for your emails are asking you to communicate with them. You can do this by highlighting specials and deals you may be offering, letting them know about new blog posts or resources on your site, or just sending a monthly update on what your company has been up to and what’s ahead. One easy tip to keep in mind is to keep them short and sweet.
When used correctly, email marketing can be one of your highest converting marketing channels. Keeping your emails shorter is a good way to keep people engaged, while still getting your message across. People can get overwhelmed when presented with too much information. If the goal is to get someone to click on a link, write an enticing description and present an appropriate call to action (CTA). If your business has a new deal or promotion, tell them exactly what it is and how to get it.
Example email opener from Flatstick Pub that covers that month’s events
One more thing to keep in mind is to not overdo it with email marketing. If a company sends too many emails people could become uninterested or annoyed. This can cause people to unsubscribe from your mailing list, which is counterproductive to your goals for doing email marketing in the first place.
Tip #5: Content Marketing - Find your audience and plant some seeds
While content marketing is not always the most 'important' online marketing task, it can be very helpful for many reasons:
Building thought leadership by answering questions around the web
Growing your online footprint by being active on Forums and Q&A sites like Quora, Reddit, and industry-specific sites such as AVVO (lawyers and attorneys) and HOUZZ (all kinds of home and garden discussions)
Both organic and paid social promotion of services and original content
Link building by creating relationships with bloggers and online influencers
Content marketing for local and small businesses can definitely take time to get right. It’s one of those tasks that’s easier to do when you have some spare time to spend on it. You need to identify “where to play”, meaning you need to find the right websites that have the discussions and content for you to engage with. This is also a very good way to identify new blog and page topics for your website. If someone has a question that requires a longer answer, write about it and post the full answer on your site. Then, you can answer the person on the site you found the topic on and refer them to the full answer on your site. Chances are that same question has been asked elsewhere, and now you have the “answer” on your site that you can point them to.
A couple of things to keep in mind before embarking on any of these tactics:
First, make sure you have specific goals. This could include obtaining more organic or referral traffic, growing your social media following, or getting more qualified leads. Each of these tips can help you meet your goals and succeed in the ever-changing digital space.
The second thing to remember stems from my note in tip #1 about websites: keep it simple. Don't overthink things; overthinking leads to inaction. Don't be afraid to dive in and get familiar with each space. The sooner you start taking action, the sooner you will start finding digital marketing success for your local or small business.
Having spent a few years working for an online-only retailer, I noticed a few places where duplicate content issues consistently popped up on our site and those of competitors. Knowing what to keep an eye on can save a lot of time and troubleshooting down the road, so here are some of the repeat offenders to look out for and how to fix them.
Problem Area: Across Categories
The most blatant form of duplicate content on an e-commerce site is at the category level. This happens if multiple categories target the exact same type of product (for example, “Men’s Boots” and “Boots for Men”). This can happen for a variety of well-intended reasons such as wanting to present the category within two different parent categories, trying to appeal to different markets, or even in the name of targeting keywords for SEO.
However, regardless of intent, if the pages are targeting the same topic, Google will not have a clear picture of which page takes priority and will likely not serve either of these pages to users above sites that offer clear structure.
There are a couple of courses of action that can be taken on this issue depending on whether you are working in more of a preventative or reactive capacity.
If you are lucky enough to be involved in the process before duplicate categories are added, education is the best strategy. Many merchandisers or product teams don’t realize that there is any issue with having multiple landing pages for a product type. Helping them understand what to avoid and why can reduce the number of duplicates preemptively.
The reactive solution to this issue is to clean up the existing pages to more clearly indicate importance. To locate the most obviously problematic categories, start with a crawl and look for identical or highly similar H1s. Then, if possible, follow that up with a manual check through a list of all live categories, as some duplicate intent (such as with synonyms) would not be as straightforward to find via crawl data.
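As a rough sketch of that crawl-based check, here is how duplicate H1s could be surfaced from a crawl export in Python. The data format here is an assumption; adapt it to whatever your crawler exports:

```python
from collections import defaultdict

def find_duplicate_h1s(pages):
    """Group page URLs by their normalised H1 so identical or
    near-identical headings surface as duplicate-category candidates.
    `pages` is an iterable of (url, h1) pairs, e.g. parsed from a
    crawl export."""
    groups = defaultdict(list)
    for url, h1 in pages:
        key = " ".join(h1.lower().split())  # normalise case and whitespace
        groups[key].append(url)
    # Only H1s shared by two or more URLs are potential duplicates.
    return {h1: urls for h1, urls in groups.items() if len(urls) > 1}

# Hypothetical crawl data for illustration
crawl = [
    ("/mens-boots", "Men's Boots"),
    ("/boots-for-men", "men's  boots"),  # same intent, different URL
    ("/womens-sandals", "Women's Sandals"),
]
print(find_duplicate_h1s(crawl))
```

Remember that this only catches literal duplicates; synonym-based overlap (as noted above) still needs the manual pass.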
When searching for duplicate categories on a site, remember that Google also uses on-page content to interpret page intent. So product selection within categories also needs to be considered, as the bulk of the content on most category pages comes from the product titles. Some e-commerce CMSs have a view that allows you to see which products are dual categorized and where; this can be very helpful for locating this particular type of duplicate content.
Once you’ve found the categories, the best course of action is to 301 redirect all but one of the overlapping categories to the one that you have determined to be most valuable (via rankings/traffic/sales/user experience). Alternately, if removing pages isn’t possible (like if they are being used for marketing purposes), you could add noindex tags to the duplicate pages or have them canonicalize back to the primary page. I would recommend putting a noindex, nofollow tag on a page that is not linked to from the site, such as a stand-alone landing page for an email, and using canonical tags for pages that live within the site’s navigable structure.
For example, if I found these categories and determined that they all contain mostly the same products:
I would then identify what each page is for, where it is linked from, and determine the action to take based on those factors:
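The decision logic described above (301 where possible, otherwise canonical or noindex depending on how the page is linked) can be sketched as a simple helper. The function name and inputs are illustrative, not part of any tool:

```python
def duplicate_page_action(is_removable, linked_from_site):
    """Pick a cleanup action for a duplicate category page:
    - 301 it to the primary category when it can be removed;
    - otherwise canonicalize it if it sits in the navigable structure;
    - otherwise (e.g. a stand-alone email landing page) noindex, nofollow it."""
    if is_removable:
        return "301 redirect to primary category"
    if linked_from_site:
        return "rel=canonical to primary category"
    return "noindex, nofollow"

print(duplicate_page_action(True, True))
print(duplicate_page_action(False, False))
```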
Problem Area: Filters or Filter Combinations
Even if categories are targeting different product types, you can still end up with identical targeting on certain pages when filters are applied. Of course, this issue will only be a significant problem if the filtered pages have unique URLs and are indexable.
To give an example, the site below sells many types of throw pillows, some of which are specific to indoor or outdoor use, and some of which are multi-use. They have separate categories for “Decorative + Throw Pillows” and “Outdoor Pillows,” which are located in respective /outdoor/ and /decor-pillows/ sections of their site.
At this point these categories are fine. While there may be a bit of overlap in intent, users and Google can both understand how they are differentiated.
However, duplicate content becomes an issue once filters are applied to these categories. Because “Outdoor Pillows” is such a broad term, this category includes a type filter for “Throw Pillow”:
And since some of the “Decorative + Throw Pillows” are multi-use, that category features one for “Outdoor”:
So now we have two self-canonicalized unique URLs targeting the exact same type of item (and even returning the same specific products).
This example is relatively straightforward, but given the limitless possibilities that some sites have for filter combinations, duplicate content from filters can quickly get out of hand if unaddressed.
Much like the cross-category duplicate content issue, the preventative solution to this one lies largely with education and process. If merchandising teams know:
That duplicate content is a thing
That it’s a problem
What to look at before creating a filter
Then they can keep these issues from existing in the first place. It may also be helpful to keep lists of similar categories or commonly overlapping filters to check before adding new filters. On sites I’ve worked on in the past, we consistently saw overlap between filters in indoor and outdoor furnishings, as well as between furnishing categories for general consumer and commercial clients.
If these filtered duplicates already exist, the solution is to remove them and 301 redirect the URLs to the page that takes precedence. Depending on what the filter was, it may be better to redirect them to one or the other of the parent category pages. However, before you remove anything, be sure to take a peek at any existing rankings or organic traffic a page may be getting, so you can factor that value into your final decision on which to axe.
Problem Area: Product Descriptions
Every product page needs text on it to explain features and product details and add an element of branding. To meet this need, many e-commerce sites will use an unedited description directly from the manufacturer.
While there are obvious reasons behind this practice, including the implied accuracy of the initial description and efficiency of onboarding processes, this is a big problem when it comes to differentiating your site in the SERPs.
Because the manufacturer sends the same information to all brands that it sells through, this leads to many sites having identical text on their product pages. Even big brands are guilty of this. For example, the product description for a specific bookcase is identical to:
If you are using identical content to compete with a more prominent brand in the SERPs, both Google and users are likely going to prioritize the site with more brand authority.
Another instance where duplicate product descriptions become an issue is when a brand expands the number of platforms it sells through; for example, an independent company that initially sells through its own site but then decides to list its products on Amazon as well. If this company uses the same descriptions on both sites, it can result in the brand losing the top spot in the SERP for its own product, as Amazon has such high domain authority.
Unique content is the solution to this problem. High effort though it may be, there is no substitute. This can be from internal teams that have writing skills, or it can be outsourced - different solutions work better for different organizations. The important part is ensuring that the content you are creating is both high quality (proper grammar, no misspellings, etc.) and unique to your site.
Problem Area: Title Tags
Duplicate title tags are an extremely widespread problem across the web in general. When SEO teams say “we need title tags on every page,” often developer teams will find the quickest and easiest solution to this problem: rolling out a standard tag across all pages.
However, title tags are supposed to help both users and Google to understand what a page is about at a high level. And clearly, identical tags on every page help no one identify topics.
The solution that many come to on this issue is algorithmically generating title tags based on the page’s H1. If this is feasible, it can be a great way to achieve SEO goals efficiently. However, in most organizations this will require developer resources to accomplish, which can be a significant constraint.
However, if your CMS is configured in such a way that it will accept a bulk upload of meta-data values, this can even be accomplished without dev dependencies by using Excel. If you use Screaming Frog to pull a list of URLs and their corresponding H1s, you can create a template design that will integrate the H1 text.
For example, If I wanted to make title tags that read “Shop *Product Type* | Example Site” I could create the following layout in Excel:
Then, I would use the CONCATENATE function to automatically generate text for all the title tags by inserting the H1 after “Shop” and putting “| Example Site” at the end as shown below.
Apply the formula to the column, and you have a list of unique title tags and their associated URLs:
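If you'd rather skip Excel entirely, the same template can be sketched in a few lines of Python. The prefix and site name here are the hypothetical values from the example above:

```python
def make_title_tag(h1, site_name="Example Site", prefix="Shop"):
    """Build a 'Shop <Product Type> | Example Site' title tag from a
    page's H1, mirroring the Excel CONCATENATE approach."""
    return f"{prefix} {h1.strip()} | {site_name}"

# Hypothetical rows pulled from a crawl export of URL + H1
rows = [
    ("/mens-boots", "Men's Boots"),
    ("/outdoor-pillows", "Outdoor Pillows"),
]
for url, h1 in rows:
    print(url, "->", make_title_tag(h1))
```

The resulting URL-to-title list can then be bulk uploaded to the CMS just like the Excel version.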
Problem Area: Blog Posts or Resource Section
Most e-commerce sites these days feature some inspirational or informational content in the form of a blog or resource section. While this can be a great asset, it can also be a bit of a minefield for duplicate content.
Poor quality outsourcing, internal content producers who don’t understand the significance of creative integrity, or even just individuals who don’t know how to cite or refer to a source appropriately can end up producing content that is duplicative of another website’s.
Even if you invest in quality unique content for your site, sometimes it can be “borrowed,” overly quoted, or just straight copied by other sites. And if the site that does this has higher domain authority, the content can be beaten by itself in the SERPs.
Depending on when you are coming into the content creation process, the first step could be an initial audit of existing content to see if any is duplicate. While much of this must be done more or less manually by doing a quick search for exact matches to sections of your content, there are some tools available that can speed the process up a bit (I like Copyscape). If you find content that is duplicate, assess the value of the topic to your site and the extent of duplication, then either remove or refresh the content piece.
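For a rough first pass before reaching for a tool, overlapping n-word "shingles" give a crude duplication score. This is a simplistic sketch, not a substitute for a proper plagiarism checker like Copyscape:

```python
def shingle_overlap(text_a, text_b, n=5):
    """Share of n-word 'shingles' from text_a that also appear in
    text_b (1.0 = fully duplicated, 0.0 = no overlapping phrases)."""
    def shingles(text):
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    a, b = shingles(text_a), shingles(text_b)
    return len(a & b) / len(a) if a else 0.0

# Hypothetical manufacturer description vs. a lightly edited copy
original = "this solid oak bookcase features five adjustable shelves and a classic finish"
copied = "this solid oak bookcase features five adjustable shelves with a modern twist"
print(round(shingle_overlap(original, copied), 2))
```

Anything scoring high against a competitor's page is a candidate for removal or a rewrite.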
If you are lucky enough to be in on the content creation process from the beginning, ensure that the writers know what duplicate content is, and both how and why it can be a detriment to the site.
While many of these recommendations sound relatively simple, I understand they can be much harder to execute in practice. But if you keep tabs on these specific areas, and work to educate teams and integrate SEO considerations into their processes, your site will be one step closer to organic success.
One particular part caught my eye and I thought I would repurpose (with permission) some of their research and data to illustrate a point that I found interesting.
The original report contains this chart:
I was immediately struck by the big gap in the organic search line: it is the channel with, by some margin, the largest number of e-commerce companies that simultaneously rely on the channel and do not test to improve and understand their performance in this area. (And this takes at face value the claim by 29% of respondents that they do test organic search, which is high in my experience.)
Regular readers will be unsurprised to hear that I’m interested in this given the huge investment we have been making into making SEO testable and quantifiable. So having noticed this tidbit, I reworked that chart to order by the gap and got this:
We very often hear from our clients and customers that they are under significant pressure to measure and justify organic SEO investments. If you find yourself repeatedly having conversations where your boss asks, "How are you measuring the value of these on-site SEO changes?" or "Do you know which of the investments we are making in on-site SEO are paying off?"
(Or if you’re the boss and you don’t know the answers to these questions).
Then maybe it’s time to check out the latest thinking in SEO split-testing - drop us a line and we’ll be happy to show you how it works.
Who hasn’t heard the proverb “keep your friends close, and your enemies closer”? This phrase alludes well to a key aspect of good SEO strategies: knowing who your search competitors are.
Your search competition is made up of sites competing for the same search visibility as your own domain. Search visibility refers to how visible your website is in search engine results. You need to know who you are fighting against on the search results battlefield (and understand what their strategies are), because that's how you can suss out where your SEO efforts will be best spent.
Even if you know who your traditional business competitors are, bear in mind that if they are not competing for the same keywords as your site, then they are not a “search” competitor.
So, how can you easily find your real online competitors? There are many ways to do this manually (using search engines), as well as free and paid tools to help automate this process. Here are six quick and easy tools to help you with your competitive research:
Prep work: Identify your keywords
Before getting started, you need to know what key terms your site is targeting (and ranking for). If you don’t have a seed list of keywords or know which ones you’re ranking for, I recommend using Google’s Keyword Planner, SEMrush or Moz Keyword Explorer to discover them. You can read this keyword research guide if you need further help with the process of finding keywords for your business. Not all keywords and phrases are created equal, however: pay special attention to phrases with high or mid average search volume (that is, a high or moderate number of monthly searches for a particular keyword) and an achievable difficulty score. Once you have established your list of key search terms, it’s time to find your competitors.
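As a sketch of that filtering step, assuming you've exported keywords with volume and difficulty scores from one of the tools above (the thresholds below are illustrative, not recommendations):

```python
def shortlist_keywords(keywords, min_volume=500, max_difficulty=50):
    """Keep keywords with at least moderate monthly search volume and
    an achievable difficulty score. `keywords` is a list of
    (keyword, monthly_volume, difficulty) tuples."""
    return [kw for kw, volume, difficulty in keywords
            if volume >= min_volume and difficulty <= max_difficulty]

# Hypothetical export: keyword, monthly searches, difficulty (0-100)
seed = [
    ("bespoke sofas", 1900, 38),
    ("sofas", 74000, 82),                   # huge volume but too competitive
    ("velvet chesterfield sofa", 320, 25),  # achievable but volume too low
]
print(shortlist_keywords(seed))
```

Tune the thresholds to your market; a niche local business may accept far lower volumes than the defaults here.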
1. Google Search
This is the most manual process for finding your online competitors, and it is totally free and straightforward. Now that you have your seed list of top keywords, it’s time to search for these terms on Google. Which pages consistently rank in the top 10 positions? If you sell bespoke sofas, for example, and you search for this term, you will see who’s ranking in the top six spots:
Tip: The results you find will depend on your location and how Google personalises your search. However, you can construct your own Google search URLs to avoid personalised results. For example:
q=example+query - this means you're searching for "example query".
pws=0 - this disables personalisation.
gl=gb - this means you're searching as if you're in the UK.
hl=en - this means you're searching as if your browser language is English.
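Putting those parameters together, a depersonalised search URL can be built like this (a small sketch using only the parameters listed above):

```python
from urllib.parse import urlencode

def google_search_url(query, country="gb", language="en"):
    """Construct a Google search URL with personalisation disabled
    (pws=0) and explicit country (gl) and language (hl) parameters."""
    params = {"q": query, "pws": 0, "gl": country, "hl": language}
    return "https://www.google.com/search?" + urlencode(params)

print(google_search_url("bespoke sofas"))
# https://www.google.com/search?q=bespoke+sofas&pws=0&gl=gb&hl=en
```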
If you want to take it one step further, you can analyse the results’ digital relevance by observing their Domain Authority (DA). DA is a search engine ranking score developed by Moz that measures the predictive ranking strength of entire domains or subdomains. Learn how Domain Authority is calculated with this link.
You can view a website's DA by using MozBar, a free Chrome extension:
A Domain Authority score ranges from 1 to 100, with higher scores corresponding to a greater ability to rank. Pay special attention to those that have a higher score than yours, so in the next step, you can try to understand the reasons why they are outranking you.
Also, you may find competitors within the paid results, as shown below:
Keep in mind that the two websites labelled “Ad” in green above are not organic results. These brands are bidding to appear, in this case, for the keyword “bespoke sofas,” so they are not showing up as a result of their pages’ relevance for this search term.
2. Google “Related:” search operator
Another way of using Google to find your online competition (which is less time-consuming than checking search results by manually typing in your target keywords) is by performing a search using the operator “related:” followed by your domain. It will help you identify websites that Google thinks are similar to yours and that might therefore be considered your competitors.
You need to type “related:[your URL website]” in the search box. In this example, I searched for domains related to “argos.co.uk”:
Note that the "related:" operator may only work for certain industries, and is typically best for larger sites. That said, it’s quick and worth trying to spot any overlapping domains, and potentially identify ones you may have otherwise overlooked.
3. STAT (paid)
STAT is one of the ranking tools we use every day at Distilled. To find your search competitors in STAT, you first need to set up keyword tracking. Plug in your domain and keyword seed list, then let STAT do its thing for about 24 hours (it takes at least a full day for ranking information to populate). Once the information is available, go to the “Competitive Landscape” tab, where you will find your top search competitors based on the keywords you’ve given STAT.
Within this same section you can track organic “share of voice” to see which domains are winning, which ones are losing and those that could become a potential threat:
Most Frequently in Top 10 in Google and Bing Example. Source
One of the most useful features available in STAT is the keyword tagging tool, which allows you to group your keywords by specific types. If your company sells pet products, you may have one tag grouping keywords that target all variants of pet food searches and another grouping keywords that target pet grooming searches.
Aside from tracking your domain’s performance across groups of keywords, you can analyze whether you have different competitors within each keyword segment. Using our pet store example, if one of your segments is pet food and another is pet grooming, you will probably find that competitors differ between these two categories.
4. SEMrush (paid)
SEMrush is a competitive research tool that provides keyword ranking and traffic data. You need to pay for a subscription to get unlimited data; however, SEMrush does provide a “freemium” model that allows you to see some information in its free version.
To find out which websites SEMrush considers your competitors, enter your domain and scroll down to the “Main Organic Competitors” section.
Domain Analytics Overview Section on SEMrush.
SEMrush identifies your competitors by analysing the number of keywords each domain ranks for and the number of keywords those domains have in common with yours. This means the more keywords you share with a website, the higher its competition level will be. Focus on the five or six competitors with the highest competition level.
Competitors Section on SEMrush.
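The "shared keywords" idea behind that competition level can be sketched as a simple overlap count. This is an illustration of the concept, not SEMrush's actual formula:

```python
def competition_levels(my_keywords, competitor_keywords):
    """Rank competitors by keyword overlap: the more keywords a domain
    shares with you, the higher it ranks. `competitor_keywords` maps
    domain -> set of keywords that domain ranks for."""
    mine = set(my_keywords)
    scores = {domain: len(mine & kws)
              for domain, kws in competitor_keywords.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical domains and keyword sets for illustration
mine = {"bespoke sofas", "corner sofas", "velvet armchairs"}
rivals = {
    "sofas.example": {"bespoke sofas", "corner sofas", "sofa beds"},
    "chairs.example": {"velvet armchairs", "office chairs"},
}
print(competition_levels(mine, rivals))
```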
5. Searchmetrics (paid)
Searchmetrics will also give you an overview of your current online presence, including some of your main competitors, for both organic and paid search. To use this tool, you need to pay a monthly subscription and, as opposed to SEMrush, Searchmetrics doesn’t provide any free data.
Go to the “SEO Research” tab and click on “Competitors”. One nice feature this tool provides, unlike SEMrush, is the competitor chart (below), a graph showing how many keywords you share with your most related competitors. On the right side you will see your broad competitors, the ones you share fewer keywords with, and on the left the ones you share more keywords with. You can display up to 250 different competitors on the graph.
Competitors Section on Searchmetrics.
6. Google Maps
Google Maps is great when you own a local business and want to find your local competition. Go to Google Maps and search for [your keyword + location], and you will see all your competitors near you:
Google Maps results for “pet shops near Wimbledon”
In the example above, we searched for “pet shops near Wimbledon” and Google shows similar businesses on the map, as well as listings on the left side. If, for instance, you also want to include New Malden as an area in which to find competitors, you can zoom out on the map to expand your results across Wimbledon and New Malden. Otherwise, if you want to look into a more specific area of Wimbledon, you can zoom in on the map to narrow your competitors’ results.
Now you have six different options for finding your search competitors. We suggest combining free and paid tools when possible, so you can take advantage of the specific functionality and features of each option:
Each tool's specific feature or benefit:
Google Search (free): Know in which position your competitors are ranking for your keywords, directly in the search results
Google "Related:" search operator (free): Find your competitors by simply typing in your URL
STAT (paid): See different competitors according to the keyword categories/tags you create
SEMrush (paid with free options): It offers the widest number of competitors
Searchmetrics (paid): It displays a graph showing up to 250 competitors, with the number of keywords you share with each one
Google Maps (free): It shows competitors when searching by location
How often should you check who your online competitors are?
New competitors may come onto the scene over time, so it’s important to stay on top of the domains within your search landscape. TL;DR? Identifying your search competitors isn’t a one-and-done exercise. Depending on your industry, you may see rapid or regular influxes of new competitors for your terms.
For example, Amazon began selling tickets to music concerts, West End theatre performances, and Off West End shows two years ago. Shortly thereafter, this giant suddenly became a direct search competitor of ticket sellers’ websites (London Theatre, Ticketmaster, London Theatre Direct, etc.). As the saying goes, an ounce of prevention is worth a pound of cure. In order to be prepared for new competition, we recommend you repeat this competitive research every quarter, or at least twice a year.
By this point, you have already identified your online competition and have a list of five or six brands for you to monitor. The next step is to perform a competitive analysis, which will allow you to observe why they may outrank your site, and point you in the right direction to craft your own SEO strategy.
Try this process out, and let us know what you find! If there are other ways you like to find your online competitors, share your tips as well.
I’ve long thought that there was an opportunity to improve the way we think about internal links, and to make much more effective recommendations. I feel like, as an industry, we have done a decent job of making the case that internal links are important and that the information architecture of big sites, in particular, makes a massive difference to their performance in search (see: 30-minute IA audit and DistilledU IA module).
And yet we’ve struggled to dig deeper than finding particularly poorly-linked pages, and obviously-bad architectures, leading to recommendations that are hard to implement, with weak business cases.
I’m going to propose a methodology that:
Incorporates external authority metrics into internal PageRank (what I’m calling “local PageRank”) to keep what makes pure internal PageRank the best data-driven approach we’ve seen for evaluating internal links, while avoiding the issues that focus attention on the wrong areas
Allows us to specify and evaluate multiple different changes in order to compare alternative approaches, figure out the scale of impact of a proposed change, and make better data-aware recommendations
Current information architecture recommendations are generally poor
Over the years, I’ve seen (and, ahem, made) many recommendations for improvements to internal linking structures and information architecture. In my experience, of all the areas we work in, this is an area of consistently weak recommendations.
I have often seen:
Vague recommendations - (“improve your information architecture by linking more to your product pages”) that don’t specify changes carefully enough to be actionable
No assessment of alternatives or trade-offs - does anything get worse if we make this change? Which page types might lose? How have we compared approach A and approach B?
Lack of a model - very limited assessment of the business value of making proposed changes - if everything goes to plan, what kind of improvement might we see? How do we compare the costs of what we are proposing to the anticipated benefits?
This is compounded in the case of internal linking changes because they are often tricky to specify (and to make at scale), hard to roll back, and very difficult to test (by now you know about our penchant for testing SEO changes - but internal architecture changes are among the trickiest to test because the anticipated uplift comes on pages that are not necessarily those being changed).
In my presentation at SearchLove London this year, I described different courses of action for factors in different areas of this grid:
It’s tough to make recommendations about internal links because, while we have a fair amount of data about how links generally affect rankings, we have less information focusing specifically on internal links. So although we have a high degree of control over them (in theory, it’s completely within our control whether page A on our site links to page B), we need better analysis:
The current state of the art is powerful for diagnosis
If you want to get quickly up to speed on the latest thinking in this area, I’d strongly recommend reading these three articles and following their authors:
A load of smart people have done a ton of thinking on the subject and there are a few key areas where the state of the art is powerful:
There is no doubt that the kind of visualisations generated by techniques like those in the articles above are good for communicating problems you have found, and for convincing stakeholders of the need for action. Many people are highly visual thinkers, and it’s very often easier to explain a complex problem with a diagram. I personally find static visualisations difficult to analyse, however, and for discovering and diagnosing issues, you need data outputs and / or interactive visualisations:
“we see that our top page is our contact page. That doesn’t look right!”
This is a symptom of a wider problem which is that any algorithm looking at authority flow within the site that fails to take into account authority flow into the site from external links will be prone to getting misleading results. Less-relevant pages seem erroneously powerful, and poorly-integrated pages that have tons of external links seem unimportant in the pure internal PR calculation.
In addition, I hinted at this above, but I find visualisations very tricky - on large sites, they get too complex too quickly and have an element of the Rorschach to them:
My general attitude is to agree with O’Reilly that “Everything looks like a graph but almost nothing should ever be drawn as one”:
Even the best visualisations I’ve seen are full link-graph visualisations. You will also very often see crawl-depth charts, which are in my opinion even harder to read and obscure even more information than regular link graphs: it’s not only the sampling, but the inherent bias of only showing links in the order discovered from a single starting page (typically the homepage), which is useful only if that’s the only page on your site with any external links. This Sitebulb article talks about some of the challenges of drawing good crawl maps:
But by far the biggest gap I see is the almost total lack of any way of comparing current link structures to proposed ones, or for comparing multiple proposed solutions to see a) if they fix the problem, and b) which is better. The common focus on visualisations doesn't scale well to comparisons - both because it’s hard to make a visualisation of a proposed change and because even if you can, the graphs will just look totally different because the layout is really sensitive to even fairly small tweaks in the underlying structure.
Our intuition is really bad when it comes to iterative algorithms
All of this wouldn’t be so much of a problem if our intuition was good. If we could just hold the key assumptions in our heads and make sensible recommendations from our many years of experience evaluating different sites.
Unfortunately, the same complexity that made PageRank such a breakthrough for Google in the early days makes for spectacularly hard problems for humans to evaluate. Even more unfortunately, not only are we clearly bad at calculating these things exactly, we’re surprisingly bad even at figuring them out directionally. [Long-time readers will no doubt see many parallels to the work I’ve done evaluating how bad (spoiler: really bad) SEOs are at understanding ranking factors generally].
I think that most people in the SEO field have a high-level understanding of at least the random surfer model of PR (and its extensions, like the reasonable surfer). Unfortunately, most of us are less good at holding a mental model of the underlying eigenvector/eigenvalue problem, and the infinite iteration / convergence of surfer models is troublesome to our intuition, to say the least.
I explored this intuition problem recently with a really simplified example and an unscientific poll:
The results were unsurprising - over 1 in 5 people got even a simple question wrong (the right answer is that a lot of the benefit of the link to the new page flows on to other pages in the site and it retains significantly less than an Nth of the PR of the homepage):
I followed this up with a trickier example and got a complete lack of consensus:
The right answer is that it loses (a lot) less than the PR of the new page except in some weird edge cases (I think only if the site has a very strange external link profile) where it can gain a tiny bit of PR. There is essentially zero chance that it doesn’t change, and no way for it to lose the entire PR of the new page.
Most of the wrong answers here are based on non-iterative understanding of the algorithm. It’s really hard to wrap your head around it all intuitively (I built a simulation to check my own answers - using the approach below).
All of this means that, since we don’t truly understand what’s going on, we are likely making very bad recommendations and certainly backing them up and arguing our case badly.
Doing better part 1: local PageRank solves the problems of internal PR
In order to be able to compare different proposed approaches, we need a way of re-running a data-driven calculation for different link graphs. Internal PageRank is one such re-runnable algorithm, but it suffers from the issues I highlighted above: it has no concept of which pages are especially important to integrate well into the architecture because they have loads of external links, and it can mistakenly categorise pages as much stronger than they should be simply because they have links from many weak pages on your site.
In theory, you get a clearer picture of the performance of every page on your site - taking into account both external and internal links - by looking at internet-wide PageRank-style metrics. Unfortunately, we don’t have access to anything Google-scale here and the established link data providers have only sparse data for most websites - with data about only a fraction of all pages.
Even if they had dense data for all pages on your site, it wouldn’t solve the re-runnability problem - we wouldn’t be able to see how the metrics changed with proposed internal architecture changes.
What I’ve called “local” PageRank is an approach designed to attack this problem. It runs an internal PR calculation with what’s called a personalization vector designed to capture external authority weighting. This is not the same as re-running the whole PR calculation on a subgraph - that’s an extremely difficult problem that Google spent considerable resources to solve in their Caffeine update. Instead, it’s an approximation, but one that solves the major issue we had with pure internal PR: unimportant pages showing up among the most powerful pages on the site.
Here’s how to calculate it:
The next stage requires data from an external provider - I used raw mozRank - you can choose whichever provider you prefer, but make sure you are working with a raw metric rather than a logarithmically-scaled one, and make sure you are using a PageRank-like metric rather than a raw link count or ML-based metric like Moz’s page authority:
You need to normalise the external authority metric - as it will be calibrated on the entire internet while we need it to be a probability vector over our crawl - in other words to sum to 1 across our site:
We then use the NetworkX PageRank library to calculate our local PageRank - here’s some outline code:
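Here is a minimal, self-contained sketch of the kind of outline code described, with a toy site structure and invented raw authority scores standing in for real crawl and mozRank data:

```python
import networkx as nx

# Toy link graph standing in for a site crawl (real use: load your crawl's
# source -> target edges instead).
G = nx.DiGraph()
G.add_edges_from([
    ("/", "/products/"), ("/", "/about/"), ("/", "/contact/"),
    ("/products/", "/products/widget/"),
    ("/products/widget/", "/"), ("/about/", "/"), ("/contact/", "/"),
])

# Hypothetical raw external-authority scores (e.g. raw mozRank). Pages
# missing from the provider's dataset simply get zero.
external = {"/": 4.2, "/products/widget/": 1.3}

# Normalise to a probability vector over the crawled pages (sums to 1).
total = sum(external.get(page, 0.0) for page in G)
personalization = {page: external.get(page, 0.0) / total for page in G}

# alpha is the damping factor; 0.5 here roughly models the share of visits
# arriving via external links rather than internal clicks, as discussed below.
local_pr = nx.pagerank(G, alpha=0.5, personalization=personalization)
```

With the personalization vector set this way, every “jump” in the random surfer model lands on pages in proportion to their external authority rather than uniformly at random.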
What’s happening here is that by setting the personalization parameter to be the normalised vector of external authorities, we are saying that every time the random surfer “jumps”, instead of returning to a page on our site with uniform random chance, they return with probabilities proportional to the external authorities of those pages. This is roughly like saying that any time someone leaves your site in the random surfer model, they return via the weighted PageRank of the external links to your site’s pages. It’s fine that your external authority data might be sparse - you can just set values to zero for any pages without external authority data - one feature of this algorithm is that it’ll “fill in” appropriate values for those pages that are missing from the big data providers’ datasets.
In order to make this work, we also need to set the alpha parameter lower than we normally would (this is the damping parameter - normally set to 0.85 in regular PageRank - one minus alpha is the jump probability at each iteration). For much of my analysis, I set it to 0.5 - roughly representing the % of site traffic from external links - approximating the idea of a reasonable surfer.
There are a few things that I need to incorporate into this model to make it more useful - if you end up building any of this before I do, please do let me know:
Include top mR pages (or even all pages with mR) - even if they’re not in the crawl that starts at the homepage
You could even use each of these as a seed and crawl from these pages
Use the weight parameter in NetworkX to weight links by type to get closer to reasonable surfer model
The extreme version of this would be to use actual click-data for your own site to calibrate the behaviour to approximate an actual surfer!
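As a sketch of the weight idea above (the link types and weight values here are purely illustrative), NetworkX’s pagerank will split each page’s outbound probability in proportion to an edge attribute:

```python
import networkx as nx

# Hypothetical weights by link type: main-content links count for more than
# boilerplate links, moving the model closer to a reasonable surfer.
G = nx.DiGraph()
G.add_edge("/", "/products/", weight=3.0)  # in-content link
G.add_edge("/", "/privacy/", weight=0.5)   # footer link
G.add_edge("/products/", "/", weight=1.0)
G.add_edge("/privacy/", "/", weight=1.0)

# nx.pagerank distributes each page's outbound probability in proportion
# to the named edge attribute.
pr = nx.pagerank(G, weight="weight")
```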
Doing better part 2: describing and evaluating proposed changes to internal linking
After my frustration at trying to find a way of accurately evaluating internal link structures, my other major concern has been the challenges of comparing a proposed change to the status quo, or of evaluating multiple different proposed changes. As I said above, I don’t believe that this is easy to do visually as most of the layout algorithms used in the visualisations are very sensitive to the graph structure and just look totally different under even fairly minor changes. You can obviously drill into an interactive visualisation of the proposed change to look for issues, but that’s also fraught with challenges.
So my second proposed change to the methodology is to find ways to compare the local PR distribution we’ve calculated above between different internal linking structures. There are two major components to being able to do this:
Efficiently describing or specifying the proposed change or new link structure; and
Effectively comparing the distributions of local PR - across what is likely tens or hundreds of thousands of pages
How to specify a change to internal linking
I have three proposed ways of specifying changes:
1. Manually adding or removing small numbers of links
Although it doesn’t scale well, if you are just looking at changes to a limited number of pages, one option is simply to manipulate the spreadsheet of crawl data before loading it into your script:
2. Programmatically adding or removing edges as you load the crawl data
Your script will have a function that loads the data from the crawl file and builds the graph structure (a DiGraph in NetworkX terms, which stands for Directed Graph). At this point, if you want to simulate adding a sitewide link to a particular page, you can do so - for example, if this line sat inside the loop loading edges, it would add a link from every page to our London SearchLove page:
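A minimal sketch of that loop (the URLs, including the SearchLove target, are placeholders; in real use the rows would come from your crawl export):

```python
import networkx as nx

# Stand-in for rows read from a crawl file of (source, target) links.
crawl_edges = [
    ("https://www.example.com/", "https://www.example.com/products/"),
    ("https://www.example.com/products/", "https://www.example.com/"),
]

# Hypothetical target of the simulated sitewide link.
SEARCHLOVE = "https://www.example.com/events/searchlove-london/"

G = nx.DiGraph()
for source, target in crawl_edges:
    G.add_edge(source, target)
    # Simulate a sitewide link: every crawled page links to the target.
    G.add_edge(source, SEARCHLOVE)
```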
You don’t need to worry about adding duplicates (i.e. checking whether a page already links to the target) because a DiGraph has no concept of multiple edges in the same direction between the same nodes, so if it’s already there, adding it will do no harm.
Removing edges programmatically is a little trickier - because if you want to remove a link from global navigation, for example, you need logic that knows which pages have non-navigation links to the target, as you don’t want to remove those as well (you generally don’t want to remove all links to the target page). But in principle, you can make arbitrary changes to the link graph in this way.
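A sketch of that removal logic, with a hypothetical set of pages whose in-content links to the target should survive the navigation change:

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("/a/", "/target/"), ("/b/", "/target/"), ("/c/", "/target/"),
])

# Hypothetical: pages whose body copy (not just the navigation) links to
# the target, so their links should be kept.
content_linkers = {"/b/"}

def remove_nav_links(G, target, keep_sources):
    """Drop links to `target` except from pages in `keep_sources`."""
    for source in list(G.predecessors(target)):
        if source not in keep_sources:
            G.remove_edge(source, target)

remove_nav_links(G, "/target/", content_linkers)
```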
3. Crawl a staging site to capture more complex changes
As the changes get more complex, it can be tough to describe them in sufficient detail. For certain kinds of changes, it feels to me as though the best way to load the changed structure is to crawl a staging site with the new architecture. Of course, in general, this means having the whole thing implemented and ready to go, the effort of doing which negates a large part of the benefit of evaluating the change in advance. We have a secret weapon here which is that the “meta-CMS” nature of our ODN platform allows us to make certain changes incredibly quickly across site sections and create preview environments where we can see changes even for companies that aren’t customers of the platform yet.
For example, it looks like this to add a breadcrumb across a site section on one of our customers’ sites:
There are a few extra tweaks to the process if you’re going to crawl a staging or preview environment to capture internal link changes - because we need to make sure that the set of pages is identical in both crawls so we can’t just start at each homepage and crawl X levels deep. By definition we have changed the linking structure and therefore will discover a different set of pages. Instead, we need to:
Crawl both live and preview to X levels deep
Combine into a superset of all pages discovered on either crawl (noting that these pages exist on both sites - we haven’t created any new pages in preview)
Make lists of pages missing in each crawl and crawl those from lists
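Sketched in Python with invented page sets, the superset and list-crawl steps look like this:

```python
# Hypothetical page sets discovered by crawling each environment X levels deep.
live_pages = {"/", "/products/", "/old-hub/"}
preview_pages = {"/", "/products/", "/new-hub/"}

# Superset of everything discovered on either crawl (all of these pages
# exist on both sites - the preview hasn't created new pages).
all_pages = live_pages | preview_pages

# Pages missed by each crawl, to be re-crawled from a list so both crawls
# end up covering an identical set of pages.
crawl_on_live = sorted(all_pages - live_pages)
crawl_on_preview = sorted(all_pages - preview_pages)
```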
Once you have both crawls, and both include the same set of pages, you can re-run the algorithm described above to get the local PageRanks under each scenario and begin comparing them.
How to compare different internal link graphs
Sometimes you will have a specific problem you are looking to address (e.g. only y% of our product pages are indexed) - in which case you will likely want to check whether your change has improved the flow of authority to those target pages, compare their performance under proposed change A and proposed change B etc. Note that it is hard to evaluate losers with this approach - because the normalisation means that the local PR will always sum to 1 across your whole site so there always are losers if there are winners - in contrast to the real world where it is theoretically possible to have a structure that..
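One simple way to start comparing is to diff the two local PageRank distributions per page (the numbers here are invented for illustration); because both distributions sum to 1, the deltas themselves sum to zero, which is exactly the winners-imply-losers effect described above:

```python
# Hypothetical local PageRank outputs for the current and proposed structures.
before = {"/": 0.40, "/products/": 0.10, "/contact/": 0.30, "/blog/": 0.20}
after = {"/": 0.38, "/products/": 0.22, "/contact/": 0.22, "/blog/": 0.18}

def pr_deltas(before, after):
    """Per-page change in local PR, winners first."""
    pages = set(before) | set(after)
    deltas = {p: after.get(p, 0.0) - before.get(p, 0.0) for p in pages}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

ranked = pr_deltas(before, after)
```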
Being new to SEO is tricky. As a niche market within a niche market, there are many tools and resources unfamiliar to most new professionals. And with so much to learn, it is nearly impossible to start real client work without first dedicating six months exclusively to industry training. Well... that’s how it may seem at first.
While it may be intimidating, investigating real-world problems is the best way to learn SEO. It exposes you to industry terminology, introduces you to valuable resources and gets you asking the right questions.
As a fairly new Analyst at Distilled, I know from experience how difficult it can be to get started. So here’s a list of common SEO analyses and supporting tools that may help you get off on the right foot.
Reviewing on-page elements
Page elements are essential building blocks of any web page. And pages with missing or incorrect elements risk not being eligible for search traffic. So checking these is necessary for identifying optimization opportunities and tracking changes. You can always go to the HTML source code and manually identify these problems yourself, but if you’re interested in saving a bit of time and hassle, Ayima’s Google Chrome extension Page Insights is a great resource.
This neat little tool identifies on-page problems by analyzing 24 common on-page issues for the current URL and comparing them against a set of rules and parameters. It then provides a list of all issues found, grouped into four priority levels: Errors, Warnings, Notices and Page Info. Descending from most to least severe, the first 3 categories (Errors, Warnings & Notices) identify all issues that could impact organic traffic for the page in question. The last category (Page Info) provides exact information about certain elements of the page.
For every page you visit, Page Insights displays a warning next to its icon indicating how many vulnerabilities were found on the page.
Clicking on the icon gives you a drop-down listing the vulnerabilities and page information found.
What makes this tool so useful is that it also provides details about each issue, such as how it can harm the page and opportunities for correction. In this example, we can see that this web page is missing an H1 tag; in this case, it could be corrected by adding an H1 tag around the page’s current heading (which is not coded as an H1).
In a practical setting, Page Insights is great for quickly identifying common on-page issues that should be fixed to ensure SEO best practice.
Measuring the load functionality and speed of a page is an important and common practice since both metrics are correlated with user experience and are highly valued by search engines. There are a handful of tools that are applicable to this task but because of its large quantity of included metrics, I recommend using WebPagetest.org.
Emulating various browsers, this site allows users to measure the performance of a web page from different locations. After sending a real-time page request, WebPagetest provides a sample of three tests containing request details, such as the complete load time, the load time breakdown of all page content, and a final image of the rendered page. There are various configuration settings and report types within this tool, but for most analyses, I have found that running a simple test and focusing on the metrics presented in the Performance Results supply ample information.
There are several metrics presented in this report, but data provided in Load Time and First Byte work great for most checks. Factoring in Google’s suggestion to have desktop load time no greater than 2 seconds and a time to first byte of 200ms or less, we can gauge whether or not a page’s speed is properly optimized.
Prioritizing page speed performance areas
Knowing if a page needs to improve its performance speed is important, but without knowing what areas need improving you can’t begin to make proper corrections. Using WebPagetest in tandem with Google’s PageSpeed Insights is a great solution for filling in this gap.
Free for use, this tool measures a page’s desktop and mobile performance to evaluate whether it has applied common performance best practices. Scored on a scale of 0-100 a page’s performance can fall into one of three categories: Good, Needs Work or Poor. However, the key feature of this tool, which makes it so useful for page speed performance analysis, is its optimization list.
Located below the review score, this list highlights details related to possible optimization areas and good optimization practices currently in place on the page. By clicking the “Show how to fix” drop down for each suggestion you will see information related to the type of optimization found, why to implement changes and specific elements to correct.
In the image above, for example, compressing two images to reduce the number of bytes that need to be loaded would improve this web page’s speed. By making this change, the page could expect a 28% reduction in image byte size.
Using WebPagetest and PageSpeed Insights together can give you a comprehensive view of a page’s speed performance and assist in identifying and executing on good optimization strategies.
How Googlebot (or Bingbot or MSNbot) crawls and renders a page can be completely different from what is intended, typically as a result of the crawler being blocked by a robots.txt file. If Google sees an incomplete or blank page, it assumes the user is having the same experience, which could affect how that page performs in the SERPs. In these instances, the webmaster tool Fetch as Google is ideal for identifying how Google renders a page.
Located in Google Search Console, Fetch as Google allows you to test whether Googlebot can access the pages of a site, identify how it renders a page, and determine if any resources are blocked from the crawler.
When you look up a specific URL (or domain) Fetch as Google gives you two tabs of information: fetching, which displays the HTTP response of the specified URL; and rendering, which runs all resources on the page, provides a visual comparison of what Googlebot sees against what (Google estimates) the user sees and lists all resources Googlebot was not able to acquire.
For an analysis application, the rendering tab is where you need to look. Begin by checking the rendered images to ensure both Google and the user see the same thing. Next, look at the list to see which resources were unreachable by Googlebot and why. If the visual elements do not show a complete page and/or important page elements are being blocked from Googlebot, that is an indication that the page is experiencing rendering issues and may perform poorly in the search engine.
Additional tools for investigating rendering issues:
Quality backlinks are extremely important for a strong web page, as they indicate a page’s reliability and trustworthiness to search engines. Changes to a backlink profile could easily affect how a page is ranked in the SERPs, so checking this is important for any webpage/website analysis. In a testament to its importance, there are several tools dedicated to backlink analytics. However, I have a preference for Ahrefs due to its comprehensive yet simple layout, which makes it great for on-the-spot research.
An SEO tool well known for its backlink reporting capabilities, Ahrefs measures several backlink performance factors and displays them in a series of dashboards and graphs. While there is plenty to review, for most analysis purposes I find the “Backlinks” metric and “New & lost backlinks” graph to be the best places to focus.
Located under the Site Explorer tab, “Backlinks” identifies the total number of backlinks pointing to a target website or URL. It also shows the quantitative changes in these links over the past 7 days with the difference represented by either a red (negative growth) or green (positive growth) subscript. In a practical setting, this information is ideal for providing quick insight into current backlink trend changes.
Under the same tab, the “New & lost backlinks” graph provides details about the total number of backlinks gained and lost by the target URL over a period of time.
The combination of these particular features works very well for common backlink analytics, such as tracking backlinks profile changes and identifying specific periods of link growth or decline.
This is only a sample of tools you can use for your SEO analyses and there are plenty more, with their own unique strengths and capabilities, available to you. So make sure to do your research and play around to find what works.
And if you are to take away only one thing from this post, just remember that as you work to build your own personal toolbox what you choose to include should best work for your needs and the needs of your clients.
In plain English, you’ll be able to write code which changes the content, headers, look, feel and behaviour of your pages via the Cloudflare CDN. You can do this without making development changes on your servers, and without having to integrate into existing site logic.
Why is this helpful?
As SEOs, we frequently work with sites which need technical improvements or changes. But development queues are often slow, resources restricted, and website platforms complex to change. It’s hard to get things changed or added.
Service workers on the edge
Cloudflare, like other CDNs, has servers all over the world. When users request a URL on your website, they’re automatically routed to the nearest geographic ‘edge node’, so that users access the site via a fast, local connection. This is pretty standard stuff.
What’s new, however, is that you can now write code which runs at those edge nodes, which allows fine-grained control over how the page is presented to the end user based on their location, or using any logic you care to specify.
With full control over the response from the CDN, it’s possible to write scripts which change title tags, alter canonical URLs, redirect the user, change HTTP headers, or which add completely new functionality; you can adapt, change, delete, build upon or build around anything in the content which is returned from the server.
It’s worth noting that other platforms, like AWS, already launched something like this in July 2017. The concept of making changes at the edge isn’t completely new, but AWS uses a different approach and technology stack.
Cloudflare’s solution is based on the Service Worker API (as opposed to Node.js), which might look like a more future-proof approach.
Service workers are the current framework of choice for progressive web apps (PWAs), managing structured markup, and playing with new/emerging formats as Google (and the wider web) moves from favouring traditional websites to embracing more app-like experiences. That makes it a good skill set to learn, to use, and potentially to recycle existing code and solutions from elsewhere in your ecosystem.
That PWAs look likely to be the next (arguably, the current) big thing means that service workers aren't going anywhere anytime soon, but Node.js might just be the current flavour of the month.
Cloudflare provides a sandbox for you to test and visualise changes on any website, though it’s unclear whether this is part of their launch marketing or something which will be around for the long-term (or a component of the editor/deployment system itself).
That’s a lot of power to play with, and I was keen to explore what it looks like in practice.
It took me just a few minutes to modify one of the scripts on their announcement page to add the word ‘awesome’ (in a pleasing shade of orange) to Distilled’s homepage. You can check out the code here.
Service workers can be complex to work with, too. For example, all of your changes are asynchronous: they run in parallel, which makes things lightning fast, but means that complex logic which relies on specific ordering or dependencies might be challenging to write and maintain.
And with all of this, there’s also no nice WYSIWYG interface, guides or tutorials (other than general JS or service worker questions on StackOverflow). You’ll be flying by the seat of your pants, spending most of your time trying to work out why your code doesn’t work. And if you need to turn to your developers for help, you’re back at our initial problem - they’re busy, they have other priorities, and you’re fighting for resources.
A meta CMS is not a toy
As we increasingly find ourselves turning to workarounds for long development cycles, issues which “can’t be fixed”, and resolving technical challenges, it’s tempting to see solutions like Google Tag Manager and Cloudflare Workers as viable solutions.
If we can’t get the thing fixed, we can patch over it with a temporary solution which we can deploy ‘higher up the stack’ (a level ‘above’/before the CMS), and perhaps reprioritise and revisit the actual problem at a later date.
You can fix your broken redirects. You can migrate to HTTPS and HTTP/2. You can work through all those minor template errors which the development team will never get to.
But as this way of working becomes habit, it’s not unusual to find that the solutions we’re using (whether it’s Google Tag Manager, Cloudflare, or our own ODN) take on the characteristics of ‘Meta CMSs’; systems which increasingly override our templates, content and page logic, and which use CMS-like logic to determine what the end user sees.
Over time, we build up more and more rules and replacements, until we find that there’s a blurring of lines between which bits of our website and content are managed in each platform.
This creates a bunch of risks and challenges, such as:
What happens when the underlying code changes, or when rules conflict? If you’re using a tag manager or CDN to layer changes ‘on top’ of HTML code and pages, what happens when developers make changes to the underlying site logic?
More often than not, the rules you’ve defined to layer your changes break, with potentially disastrous consequences. And when you’ve multiple rules with conflicting directives, how do you manage which ones win?
When you’ve got lots of rules or particularly complex scripts, you’ll need a logging or documentation process to provide human-friendly overviews of how all of the moving parts work and interact.
Who logs what’s where? If conflicts arise, or if you want to update or make new changes you’ll need to edit or build on top of your existing systems. But how do you know which systems - your CMS or your meta CMS - are controlling which bits of the templates, content and pages you want to modify?
You’ve got rules and logic in multiple places, and it’s a headache keeping track.
When the CEO asks why the page he’s looking at is broken, how do you begin to work out why, and where, things have gone wrong?
How do you do QA and testing? Unless your systems provide an easy way to preview changes, and allow you to expose testing URLs for the purposes of QA, browser testing and similar, you’ve got a system with a lot of power and very little quality control. At the moment, it doesn’t look like Cloudflare supports this.
How do you manage access and versioning? As your rules change, evolve and layer over time, you’ll need a way of managing version control, change logging, and access/permissions. It’s unclear if, or how, Cloudflare will address this at the moment, but the rest of their ecosystem is generally lacking in this regard.
How do you prevent accidental exposure, caching of private data, PII leaks, and so on? When you’ve full access to every piece of data flowing to or from the server, you can very easily do things which you probably shouldn’t - even accidentally. It doesn’t take much to accidentally store, save, or expose private user information, credit card transaction details, and other sensitive content.
In general then, relying overly on your CDN as a meta CMS feels like a risky solution. It’s good for patching over problems, but it’s going to cause operational and organisational headaches.
That’s not to say that it’s not a useful tool, though. If you’re already on Cloudflare, and you have complex challenges which you can resolve as a one-off fix using Cloudflare Workers, then it’s a great way to bypass the issue and get some easy wins.
Alternatively, if you need to execute geographically specific content, caching or redirect logic (at the closest local edge node to the user), then this is a really great tool - there are definitely use cases around geographically/legally restricted content where this is the perfect tool for the job.
Otherwise, it feels like trying to fix the problem is almost always going to be the better solution. Even if your developers are slow, you’re better off addressing the underlying issues at their source than patching on layers of (potentially unstable) fixes over the top.
Sometimes, Cloudflare Workers will be an elegant solution - more often than not, you should try to fix things the old-fashioned way.
ODN as a meta CMS
Except, there may be an exception to the rule.
If you could have all of the advantages of a meta CMS, but with provisions for avoiding all of the pitfalls I’ve identified - access and version control, intuitive interfaces, secure testing processes, and documentation - you could solve all of your technical SEO challenges overnight, and they’d stay solved.
And whilst I want to stress that I’m not a sales guy, we have a solution.
Our ‘Optimisation Delivery Network’ product (Distilled ODN for short) does all of this, with none of the disadvantages we’ve explored.
We built, and market, our platform as an SEO split-testing solution (and it’s a uniquely awesome way to measure the effectiveness of on-page SEO changes at scale), but more interestingly for us, it’s essentially a grown-up meta CMS.
It works by making structured changes to pages, between the request to the server and the point where the page is delivered back to the user. It can do everything that Google Tag Manager or Cloudflare can do to your pages, headers, content and response behaviour.
And it has a friendly user interface. It’s enterprise-grade, it’s scalable, safe, and answers to all of the other challenges we’ve explored.
We have clients who rely on ODN for A/B testing their organic search traffic and pages, but many of these also use the platform to just fix stuff. Their marketing teams can log in, define rules and conditions, and fix issues which it’d typically take months (sometimes years) for development teams to address.
So whilst ODN still isn’t a perfect fix - if you’re in need of a meta CMS then something has already gone wrong upstream - it’s at least a viable, mature and sophisticated way of bypassing clunky development processes and delivering quick, tactical wins.
I expect we’ll see much more movement in the meta CMS market in the next year or so, especially as there are now multiple players in the space (including Amazon!); but how viable their products will be - if they don’t have usable interfaces and account for organisational/operational challenges - is yet to be seen.
The publishing industry has been claiming victory recently in a long-running disagreement with Google over how subscription content (i.e. content that sits behind a paywall or registration wall) should appear in search results.
There’s a lot of confusion around the new policy which Google has announced, and a lack of clarity in how the media, publishers, and Google itself are reporting and discussing the topic.
Google’s own announcement is typically obtuse in its framing (“Enabling more high-quality content for users”) but has plenty enough information for those of us who spend much of our professional lives interpreting the search engines’ moves to figure out what’s going on.
The short version is that what’s being reported as “ending” the first click free policy is really about extending it. There are some parts of the extension that publishers have asked for, but the key concession Google is demanding - that publishers label the paywalled content in a machine-readable way - will lead to further weakening of the publishers’ positions.
To run through the full analysis, I’m going to start with some background - but if you know all about the history, go ahead and jump ahead to my new analysis and conclusions.
The background - what was First Click Free (FCF)
In the early days of Google, they indexed only content that was publicly available on the open web to all users and crawlers. They did this by visiting all pages on the web with their own crawler - named Googlebot. At various points, they encountered behaviour that they came to label cloaking: when websites showed different content to Googlebot than to everyone else. This was typically done to gain a ranking advantage - for example, stuffing a load of text onto a page containing words and phrases that didn’t appear in the article users were shown, with the objective of appearing in searches for those words and phrases.
Google disliked this practice both because it messed up their index, and - the official line - because it resulted in a poor user experience if someone clicked on one of these articles and then discovered content that was not relevant to their search. As a result, they declared cloaking to be against their guidelines.
In parallel, publishers were working to figure out their business models on the web - and while many went down the route of supporting their editorial business with advertising, many wished to charge a subscription fee and allow only paying customers to access their content.
The conundrum this presented was in acquisition of those customers - how would people find the paywalled content? If Googlebot was blocked at the paywall (like all other logged-out users) - which was the only legitimate publisher behaviour that wasn’t cloaking - then none of those pages would rank for anything significant, as Googlebot would find no real content on the page.
Google’s solution was a program they called First Click Free (FCF) which they rolled out first to news search and then to web search in 2008. This policy allowed publishers to cloak legitimately - to show Googlebot the full content of pages that would be behind the paywall for regular users by identifying the Google crawler and specifically treating it differently. It allowed this behaviour on the condition that the publishers allow any user who clicked on a Google search result to access the specific article they had clicked through to read whether they had a subscription or not. After this “first click” which had to be free, the publisher was welcome to enforce the paywall if the user chose to continue to request subsequent pages on the site.
Problems with First Click Free and the backlash
The biggest problem with FCF was that it created obvious holes in publishers’ paywalls, leading to the open secret that you could access any article you wanted on many major newspaper sites simply by googling the headline and clicking through. While complying with Google’s rules, there was little the publishers could do about this: they were allowed to implement a cap, but were required to allow at least three articles per day - more than most users ever read on most paywalled sites, and so effectively no cap at all.
Publishers also began to resent more generally that Google was effectively determining their business models. While I have always been concerned about exactly what will continue to pay for journalism, I have always had little sympathy for the argument that Google was forcing publishers to do anything. Google was offering a way of cloaking legitimately if publishers were prepared to enable FCF. Publishers were always welcome to reject that offer, not enable FCF, and also keep Googlebot out of their paywalled content (this was the route that The Times took).
Earlier this year, the Wall Street Journal pulled out of FCF, and reportedly saw a drop in traffic, but an increase in subscriptions.
The new deal is really an expansion of FCF
The coverage has been almost exclusively describing what’s happening as Google ending the FCF program, whereas it really sounds more like an expansion. Whereas before Google offered only one legitimate way of compensating for what would otherwise be cloaking, they are now offering two options:
Metering - which includes the option previously called FCF - requires publishers to offer Google’s users some number of free clicks per month at their own discretion - but now also allowing publishers to limit how many times a single user gets free content after clicking through from Google
Lead-in - which shows users some section or snippet of the full article before requiring registration or payment (this is how thetimes.co.uk implements its paywall at the moment - so under the new rules they would now legitimately be able to allow Googlebot access to the full normally-paywalled content subject to my important notes below)
Google is imposing a critical new condition
However, both of these options come with a new limitation: in order to take part in the expanded scheme they now call Flexible Sampling, publishers must mark up content that will be hidden from non-subscribers using machine-readable structured markup called JSON-LD. Structured markup is a machine-readable way of providing more information and context about the content on a page - and in this case it enables Google to know exactly which bits of content Googlebot is getting to see only because it’s Googlebot (and the publisher is engaging in Flexible Sampling) and what will actually be visible to users when they click through.
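To make that concrete, a sketch of what such markup might look like for a news article follows. The structure uses schema.org properties in JSON-LD; the `.paywall` class name is illustrative, and publishers should check Google’s own documentation for the exact requirements:

```json
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "Example paywalled article",
  "isAccessibleForFree": "False",
  "hasPart": {
    "@type": "WebPageElement",
    "isAccessibleForFree": "False",
    "cssSelector": ".paywall"
  }
}
```

Note how the `cssSelector` couples the markup directly to the page’s presentation - the markup is only valid so long as the paywalled content actually sits inside elements matching that selector.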
And here’s the rub.
This new requirement is listed clearly in Google’s announcement but is getting little attention in the mainstream coverage - probably because it’s both a bit technical, and because it probably isn’t obvious what difference it makes to publishers beyond a bit of development work(*).
To me, though, this requirement screams that Google wants to do the same things they’ve done with other forms of structured markup - namely:
Present them differently in the search results
Aggregate and filter them
(*) Incidentally, the technical requirement that the JSON-LD markup declare the CSS selector for the paywalled content is one that we at Distilled predict is going to present maintenance nightmares for many publishers - it essentially means that any time a publisher makes a visual change to the user interface on any of their article pages, they’re going to need to check that they haven’t broken their compliance with the new Flexible Sampling program. These are often considerations of different teams, and it is very likely that many publishers will accidentally break this regularly in ways that are not obvious to them or their users. It remains to be seen how Google will treat such violations.
1. I’m convinced Google will label paywalls in the search results
My thinking here is that:
Hard paywalls are already labelled in Google News
Many other forms of structured markup are used to change the display in the search results (probably the most obvious to most users is the ratings stars that appear on many product searches - which come from structured markup on the retailers’ sites)
Especially in the case of a hard paywall with only a snippet accessible to most users, it’s a pretty terrible user experience to land on a snippet of content and a signup box (much like you see here if you’re not a subscriber to The Times) in response to most simple searches. Occasionally a user might be interested in taking out a new subscription - but rarely to read the single article they’re searching for right now
Point 3 is the most critical (1 & 2 simply show that Google can do this). Given how many sites on the web have a paywall, and how even the most engaged user will have a subscription to a handful at most, Google knows that unlabelled hard paywalls (even with snippets) are a terrible user experience the majority of the time.
I fully expect, therefore, to see paywall labels appear on search results. From Google’s perspective, the new arrangement will:
Allow them to offer a scheme (“flexible sampling”) that is consistent with what publishers have been demanding
Let publishers claim a “win” against big, bad Google
Enable the cloaking that lets Googlebot through even hard paywalls (all but the most stringent paywalls have at least a small snippet for non-logged-in users to entice subscriptions)
Avoid having to remove major media sites from the search results or demote them to lower rankings
And yet, by labelling them clearly, get to the point that pretty much only users who already have a subscription to a specific site ever click on the paywalled results (the number of subscriptions you already have is small enough that you are always going to remember whether you have access to any specific site or not)
My prediction is that the end result of this looks more like what happened when the WSJ pulled out of FCF (something publishers could already do) - reportedly good for the WSJ, but likely very bad for less-differentiated publishers. In other words, publishers have gained very little in this deal, while Google is getting them to expend a load of energy and development resource carefully marking up all their paywalled content for Google to flag it clearly in the search results. (Note: Google VP for News, Richard Gingras, has already been hinting at some of the ways this could happen in the Google News onebox).
2. What does aggregation look like?
Once Google can identify paywall content at scale across the web (see the structured markup information above) they open up a number of interesting options:
Filtering subscription content out of a specific search and seeing only freely-available content
Filtering to see only subscription content - perhaps from user-selected publications (subscriptions you own)
Possible end-game: connecting programmatically to subscription APIs in order to show you, automatically, both free content and content you already have a subscription to
Offering a bundle (Chris Dixon on why bundles make economic sense for both buyers and sellers). What if you could pay some amount that is more than a single subscription but less than two, and get access to five or six major media sites? It’s very likely that everyone (except publishers outside the bundle!) would be better off. Very few players have the power to make such a bundle happen. It’s possible that Google is one of those players.
Under the third scenario (programmatic connections to subscription APIs), Google would know who had access to the bundle and could change the display in the search results to emphasise the “high quality, paid” content that a particular searcher had access to - in addition to the free content and other subscription sites outside the bundle. Are we going to see a Spotify for Publishers? We should all pay close attention to the “subscription support” tools that Google announced alongside the changes to FCF. Although these are starting with easy payment mechanisms, the paths to aggregation are clear.
Ben Thompson has been writing a lot recently about aggregators (that link is outside his paywall - a subscription I wholeheartedly recommend - I look forward to seeing his approach to the new flexible sampling options on his own site, as well as his opinions). Google is the prototypical super aggregator - making huge amounts of money by aggregating others’ work with effectively zero transaction costs on both the acquisition of their raw materials and their self-service sale of advertising. Are they about to aggregate paid subscription content as well?
Publishers are calling this a win. My view is that the new Google scheme offers:
Something that looks very like what was in place before (“metering”)
And demands in return a huge amount of structured data which will cement Google’s position, allow them to maintain an excellent user experience without sending more traffic to publishers, and start them down a path to even more aggregation.
If paywalls are to be labelled in the search results, publishers will definitely see a drop in traffic compared to what they received under FCF. The long-term possibility of a “Spotify for Publishers” bundle will likely be little solace in the interim.
Are you a publisher?
If you’re wondering what you have to do as a result of these changes, or what you should do as the landscape shifts, don’t hesitate to get in touch and we will be happy to discuss your specific situation.
This post was written by Courtney Louie, a student at Washington State University. During her time in the Seattle consulting team, Courtney tried her hand at many tasks that make up the day-to-day workings of an analyst or consultant. One of those was keyword research, and she decided to share her learning process for keyword research using Ahrefs...
Keyword Research is a fundamental principle of SEO. As marketers, we know by now that it can help drive traffic to our sites and improve organic search rankings. With the vast amount of tools available, it can be hard to find the right one. While doing keyword research for a client, Distilled Analyst Lydia Gilbertson introduced me to arguably the best tool out there - Ahrefs’ Keywords Explorer.
This tutorial will walk you through the keyword research process for a client and show you how to turn the results into actionable decisions using Ahrefs’ tools.
What sets Ahrefs apart from its competitors is its ease of use and the minimal navigation needed to get the results you want.
Sure, other industry tools will give you similar data and metrics, but Ahrefs has the most comprehensive set. The variety of tools allows you to handle all your website’s metrics in one place. This is where the competitors fall short.
The client used for our example
An e-commerce site looking to improve keyword targeting on its top 20 product pages - a nice, small batch of URLs that we can dig into for trends and competitor research.
Selecting keywords to research
Start out with what you know
When starting research for the client, look at the product pages and examine the title and H1 tags. These give a good indication of what the page is trying to target and might currently be ranking for (and are also where the biggest improvements can be made - we’ll touch on that later). It is helpful to pull this data from a Screaming Frog crawl so you have the titles and H1s in a list. Seed keywords can then be made directly from part of a page’s current title or H1.
Think like a user
Another effective technique is to think like a user or customer of your site would. For example, say a customer is looking to buy an aluminum water bottle. The first searches that come to mind are “best aluminum water bottles” or “aluminum water bottle brands”. Apply this method to some of the pages you want to find keywords for. This technique is especially helpful when trying to find long-tail keywords. As always, put the ones you come up with in a separate spreadsheet for easy inputting into Ahrefs.
See what your competitors are ranking for
Ahrefs has a great tool for this called Site Explorer. It allows you to insert any URL and automatically generates data on the keywords it currently ranks for. In the case of the client example, one of their main competitors is a major food and gift basket brand. Let’s walk through this:
By pasting the URL into Site Explorer and keeping the “domain” setting as is, you will get keyword data on the whole domain, and not just the particular URL. Ignoring everything else on the Site Explorer page, you should then navigate to the Organic Keywords section.
Below you see a list of all the keywords the competitor is ranking for. By default this list is sorted by traffic, but clicking any metric column sorts by that metric, highest or lowest. There will always be self-referencing (branded) keywords, but you can remove these after exporting the data. Focus on the keywords that apply to your site.
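If you end up with a large export, stripping the branded keywords by hand gets tedious. A quick illustrative filter - the brand terms and keywords below are invented - might look like this:

```javascript
// Brand terms to exclude from the exported keyword list (invented).
const brandTerms = ['acme'];

// A keyword counts as branded if it contains any brand term.
function isBranded(keyword) {
  return brandTerms.some(term => keyword.toLowerCase().includes(term));
}

// Invented sample of an exported keyword list.
const exported = ['acme gift baskets', 'butter toffee', 'chocolate gifts'];
const unbranded = exported.filter(kw => !isBranded(kw));
```

The same filtering can of course be done with a spreadsheet formula; the point is simply to remove self-referencing keywords before analysis.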
Getting the data
I recommend combining the three methods above to generate a list of keywords. The end goal of most keyword research is to identify and serve the searcher’s intent. This can be done by making data-driven decisions, but first it’s important to understand what the data means.
Take the potential keywords and input them into Keywords Explorer to analyze the data.
Paste the keywords into the input box and keep the search country set to the United States (change this based on where your target audience is located). Notice at the bottom you can also select New List, which will save these keywords into a unique list for reference later. These lists are helpful if you want different saved sets of keywords for parts of your website, individual pages, or clients in general.
The Overview and Metrics tabs are equally important, but for the task at hand let’s focus on the Metrics tab. The Overview tab aggregates the data from all the inputted keywords and gives a summary of it, whereas the Metrics tab breaks down the data for each keyword.
Keywords explorer metrics
Looking at the metrics tab, let’s focus on three metrics that are the most helpful and not as self-explanatory:
Keyword Difficulty (KD)
This is an Ahrefs-specific score that estimates how hard it will be to rank in the top 10 results for a keyword, calculated from how many referring domains the current top 10 results have for that keyword. It’s great for judging whether a keyword is worth targeting: the lower the KD score, the easier it is to rank for.
Volume
Volume is the number of searches per month, averaged over the last 12 months. It helps you gauge how “popular” a keyword is, but be careful not to base your keyword research solely on volume: volume alone may not accurately reflect the user’s intent.
SERP features
This is a great metric that (at the time of writing) is specific to Ahrefs. Tiny icons next to each keyword show what appears on the search results page: a related question, image pack, knowledge panel, shopping result, and so on.
Using these three metrics on the client’s 20-keyword list, meaningful insights can be gained. Striking a balance between an achievable KD score and substantial volume is imperative - use the combination of this data to your advantage.
Many marketers are so driven to increase volume and clicks that they don’t realize the keyword they are targeting is nearly impossible to rank for in the top ten - it might need, say, 50 referring domains. They are blinded by the fact that the keyword has the highest volume. There might be a keyword with 10-20% less volume that only needs five referring domains to rank, which is obviously the more realistic choice.
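One crude, illustrative way to make that trade-off explicit is to score each keyword by volume relative to difficulty. The keywords and numbers below are invented, and this is not an Ahrefs formula - just a sketch of the reasoning:

```javascript
// Invented example data: monthly volume and an Ahrefs-style KD (0-100).
const keywords = [
  { term: 'butter toffee',        volume: 4500, kd: 50 },
  { term: 'butter toffee recipe', volume: 3800, kd: 5  },
];

// A rough "opportunity" score: searches per unit of difficulty.
// (+1 avoids dividing by zero for KD 0 keywords.)
function opportunity(kw) {
  return kw.volume / (kw.kd + 1);
}

// Sort so the most realistic targets come first.
const ranked = [...keywords].sort((a, b) => opportunity(b) - opportunity(a));
// ranked[0] is the lower-volume but far easier keyword.
```

Any weighting like this is a judgment call; the point is simply to stop volume from being the only number you look at.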
Helpful results to find additional keywords
After completing the above tasks, look at the list and use the three metrics to determine if your pre-selected keywords are realistically worth targeting. You now want to find additional keywords, based on the ones on your list, that have a better balance of metrics.
Here are two results to focus on:
Both of these can be found by clicking on an individual keyword from your list. This expands the data on that specific keyword.
Parent topic
‘Butter toffee’ is a keyword originally found for the client example. You can see that its parent topic is ‘butter toffee recipe’. You might be wondering why this is the parent topic, since it’s longer and more complex. Parent topics are found by taking the highest-ranking page for your original keyword and then taking the ‘best’ keyword that page ranks for, where ‘best’ is a combination of volume and traffic potential.
If you click on the parent topic, it will expand the data on that word. It will also show you all the keywords that top page is ranking for.
The additional keywords listed give good ideas for ones that might serve the user’s intent better while keeping a good balance of metrics.
For instance, with ‘butter toffee recipe’, it’s clear that the client’s page (which only targets one keyword) can rank for multiple keywords since there are many relevant ones within this list.
Keyword ideas
The results from this are broken into three categories: ‘Having same terms’, ‘Also rank for’ and ‘Search suggestions’.
Each category provides you with a different set of keyword ideas that are relevant or an expansion of your original keyword. To find out more about how each one of these categories is populated, check out this article from Ahrefs.
Actionable to-dos after finding keywords
After creating a list of the best keywords for the client’s product pages (using the practices above), it’s time to begin optimizing for them. Focus on three main on-page optimizations that can easily be changed with the help of your development team or webmaster.
Optimize these three HTML elements to specifically target the new keyword that correlated with each product page.
For example, if an old product page has an H1 of ‘Award-Winning, Champion Butter Toffee’ and the new keyword to target is ‘butter toffee’, rewrite the H1 to ‘Butter Toffee | [company name]’.
Making these on-page optimizations based on your keywords will have a greater potential effect on your organic search visibility. If you want to test these optimizations on a certain set of pages, split testing has become a popular way to gain further insight - I highly recommend this article on the subject by Distilled Senior Consultant Tim Allen.
How does this help you or your business?
The tools from Ahrefs aid the keyword research process from beginning to end. Once you get the hang of the best practices above, time spent on keyword research will decrease and you will make better, faster decisions using this research structure. Keyword research may have a reputation as a daunting task, but with the tools available today it feels less tedious and more like meaningful research. High-quality keyword research can drive some of the most impactful targeting changes to improve your site and grow traffic in the long run.
In the two-and-a-bit years that I’ve been working in digital marketing, I’ve been keen to understand the reasons behind the decisions we make. There is a wide variety of ways to approach a problem, and although many of them have value, there has to be a best way to make sure that the decision you make is the right one. I wanted to take my evidence-first way of thinking and apply it to this field.
In a previous life, I worked in clinical science, specifically in the field of medical physics. As part of this I was involved in planning and carrying out clinical trials, and came across the concept of a ‘hierarchy of evidence’. In clinical research, this refers to the different standards of evidence that can be used to support a claim – be that a new drug, piece of technology, surgical technique or any other intervention that is claimed to have a beneficial effect – and how they are ranked in terms of strength. There are many different formulations of this hierarchy, but a simple version can be seen here:
According to this ordering, a systematic review is the best type of evidence. This involves taking a look across all of the evidence provided in clinical trials, and negates the effects of cherry-picking – the practice of using only data that supports your claim, and ignoring negative or neutral results. With a systematic review we can be sure that all of the evidence available is being represented. A randomised controlled trial is a method of removing any extraneous factors from your test, and of making sure that the effect you’re measuring is only due to the intervention you’re making.
This is opposed to case-control reports, which involve looking at historical data of two populations (e.g. people who took one drug vs. another) and seeing what their outcomes were. This has its uses when it is not possible to carry out a proper trial, but it is vulnerable to correlations being misidentified as causation. For example, patients who were prescribed a certain course of treatment may happen to live in more affluent areas and therefore have hundreds of other factors causing them to have better outcomes (better education, nutrition, fewer other health problems, etc.).
All of these types of tests should be viewed as more authoritative than the opinion of anyone, regardless of how experienced or qualified they are. Often bad practices and ideas are carried on without being re-examined for a long time, and the only way we can be sure that something works is to test it. I believe that this is also true in my new field.
A hierarchy of evidence for digital marketing
While working at Distilled, I’ve been thinking about how I can apply my evidence-focussed mindset to my new role in digital marketing. I came up with the idea for a hierarchy of evidence for digital marketing that could be applied across all areas. My version looks like this:
A few caveats before I start: this pyramid is by no means comprehensive – there are countless shades of grey between each level, and sometimes something that I’ve put near the bottom will be a better solution for your problem than something at the top.
I’ll start at the bottom and work my way up from worst to best standards of evidence.
Hunches
Obviously, the weakest form of evidence you can base any decision on is no evidence at all. That’s what a hunch is – a feeling that may or may not be based on past experience, or just what ‘feels right’. But in my opinion as a cold-hearted scientist, evidence nearly always trumps feelings, especially when it comes to making good decisions.
Having said that, anyone can fall into the trap of trusting hunches even when better evidence is available.
Best practice advice
It’s easy to find advice on the ‘best practice’ for any given intervention in digital marketing. A lot of it is brilliant (for example DistilledU), but that does not mean it is enough. No matter how good best practice advice is, it will never compare to evidence tailored to your specific situation and application. Best practice is applicable to everything, but perfect for nothing.
Best practice is nevertheless a good option when you don’t have the time or resources to perform thorough tests yourself, and it plays a very important role when deciding what direction to push tests in.
Anecdotal evidence
A common mistake in all walks of life is thinking that just because something worked once before, it will work all of the time. This is generally not true – what matters is data, not anecdotes. It’s especially important not to assume that a method that worked once will work again in this field, where things are always changing and every case is wildly different.
As with the best practice advice above, anecdotal evidence can be useful when it informs the experimentation you do in the future, but it should not be relied upon on its own.
Uncontrolled/badly controlled tests
You’ve decided what intervention you want to make, you’ve enacted it and you’ve measured the results. This sounds like exactly the sort of thing you should be doing, doesn’t it? But you’ve forgotten one key thing – controls! You need something to compare against, to make sure that the changes you’re seeing after your intervention are not due to random chance, or some other change outside of your control that you haven’t accounted for. This is where you need to remember that correlation is not causation!
Almost as bad as not controlling at all is designing your experiment badly, such that your control is meaningless. For example, a sporting goods ecommerce site may make a change to half the pages on its site, and measure the effect on transactions. If the change is made on the ‘cricket’ category just before the cricket season starts, and is compared against the ‘football’ category, you might see a boost in sales for ‘cricket’ which is irrelevant to the changes you made. This is why, when possible, the pages that are changed should be selected randomly, to minimise the effect of biases.
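To make the cricket/football example concrete, here is a minimal Python sketch of randomly assigning pages to control and variant buckets. The page URLs, bucket names, and fixed seed are illustrative assumptions, not taken from any real tool:

```python
import random

def assign_buckets(pages, seed=42):
    """Randomly split pages into control and variant buckets.

    Random assignment spreads confounders (seasonality, category,
    traffic level) roughly evenly across both buckets, so a difference
    between buckets is more likely to reflect the change itself.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = pages[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "variant": shuffled[half:]}

# Hypothetical example: ten product pages split 50/50
buckets = assign_buckets([f"/product/{i}" for i in range(10)])
```

With this approach, a seasonal boost to cricket pages falls roughly equally on both buckets instead of inflating one side of the comparison.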
Randomised controlled trials (A/B testing)
The gold standard for almost any field where it’s possible is a randomised controlled trial (RCT). This is true in medicine, and it’s definitely true in digital marketing as well, where they’re generally referred to as A/B tests. This does not mean that RCTs are without flaws, and it is important to set up your trial correctly to negate any biases that might creep in. It is also vital to understand the statistics involved here. My colleague Tom has written on this recently, and I highly recommend reading his blog post if you’re interested in the technical details.
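As a rough illustration of the statistics involved (a standard two-proportion z-test, not a summary of Tom's post), here is a self-contained Python sketch; the conversion counts are invented:

```python
from math import sqrt, erf

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test for an A/B test.

    conv_a/conv_b are conversion counts, n_a/n_b are visitor counts.
    Returns (z, p_value). A sketch only: a real analysis should also
    plan sample size up front and fix a stopping rule in advance.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value via the normal CDF: Phi(x) = 0.5*(1+erf(x/sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical test: 4% vs 5% conversion on 5,000 visitors per arm
z, p = two_proportion_z(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
```

In this invented example the difference comes out statistically significant at the 5% level, but the same machinery will happily flag noise as a win if the test is peeked at repeatedly, which is exactly why the underlying statistics matter.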
A/B testing has been used extensively in CRO, paid media and email marketing for a long time, but it has the potential to be extremely valuable in almost any area you can think of. In the last couple of years, we’ve been putting this into practice with SEO, via our DistilledODN tool. It’s incredibly rewarding to walk the walk as well as talk the talk with respect to split testing, and to be able to prove for certain that what we’re recommending is the right thing to do for a client.
Sign up to find out more about our new ODN platform, for a scientific approach to SEO.
The reality of testing
Even with a split test that has been set up perfectly, it is still possible to make mistakes. A test can only show you results for things you’re testing for: if you don’t come up with a good intervention to test, you won’t see incredible results. Also, it’s important not to read too much into your results. Once you’ve found something that works brilliantly in your test, don’t assume it will work forever, as things are bound to change soon. The only solution is to test as often as possible.
If you want to know more about the standards of evidence in clinical research and why they’re important, I highly recommend the book Bad Science by Ben Goldacre. If you have anything to add, please do weigh in below the line!