Strategically organizing and structuring your website into categories, subcategories, and product pages will benefit both your visitors and the search engines.
When your visitors can find what they need quickly and easily, they’re more likely to buy. When search engines can find what they need quickly and easily, they’re more likely to rank your site high and reward you with traffic.
The question becomes: how do you balance both?
In this article, I will teach you why information architecture and site structure matter in ecommerce, how to do basic keyword research to determine the best structure for your store, the basics of URLs and internal linking for SEO, and how to deal with filtering and faceting to control the potential for duplicate content.
This advice comes from my ten years of SEO experience, the last six spent helping large websites with complicated structures drive more traffic.
Table of Contents
- Why site structure matters
- Keyword research for site structure
- Keyword research tools
- Organizing keywords into rank tracking buckets
- Basics of internal linking
- Avoiding the SEO pitfall of duplicate content
- Next steps
Why site structure matters
Your website’s organization and information architecture are important not just for search engine optimization (SEO), but more importantly for conversions and cart size. If you can help your visitors find what they need more easily, then you have a better chance of converting them into customers.
Search engines want to rank the best results for a given query. If you search [dog food], would you prefer to land on a single page that only lists dog food or a page listing dog, cat, bird, and mouse food?
My guess is the former, which is why you see sites like Chewy.com, Petco.com, PetSmart, and Petflow all ranking with pages that only list dog food:
Ranking is much more complicated than this of course, as you have to take links, site authority, and much more into account to get a full picture of what it takes to rank.
Because search engines want to serve up the best result for a given query, they prioritize pages that target that specific term. Realistically, you cannot expect the page that ranks for [pet food] to also rank for [dog food], or a page that ranks well for [dog food] to rank as well for [Purina dog food], since the word “Purina” signals that the results on the page only cover one brand.
User intent aside, website architecture plays a critical role in determining how well you’ll rank because search engines partially base their rankings on how well you link and organize your important pages. You don’t need to go as far as calculating internal PageRank (though you can, if you want) to know that pages targeting high-volume, high-competition terms should be linked from as close to your homepage and other highly linked-to pages as possible.
A clean website architecture looks like this:
Starting from your homepage, visitors should be able to navigate down to categories, then subcategories, and then products. Every product should be in one category (for example [pet type] food) and can then be in multiple subcategories (e.g. [brand] dog food, [brand] cat food, etc.)
Using this simple example with four categories, each category page receives 25% of the equity passed from the homepage. Then each subcategory receives a percentage of that category’s authority, and so on down the hierarchy.
The more competitive a search query you are targeting, the more internal (and external) links that page needs.
Products are purposefully organized below categories and subcategories because they receive less traffic than the head terms targeted by your categories. Individual products are also much easier to rank for because they have less competition in the search results, thus they do not need to be as close to your homepage. They simply do not need as many internal links to rank.
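The equity math above can be sketched as a toy calculation. This is illustrative only: it assumes link equity splits evenly among a page’s child links and ignores external links, navigation links, and damping factors, so real PageRank behaves quite differently, but the shape of the drop-off is the point.

```python
# Illustrative only: assumes equity splits evenly among child links and
# ignores external links, navigation, and damping factors.
def equity_per_child(parent_equity, num_children):
    """Share of a parent page's equity passed to each child it links to."""
    return parent_equity / num_children

home = 1.0                                   # treat the homepage as 100%
category = equity_per_child(home, 4)         # 4 categories -> 25% each
subcategory = equity_per_child(category, 5)  # 5 subcategories -> 5% each
product = equity_per_child(subcategory, 10)  # 10 products -> 0.5% each

print(f"category: {category:.2%}, subcategory: {subcategory:.2%}, product: {product:.2%}")
```

Even in this toy model, a product three levels down receives a small fraction of the homepage’s equity, which is why it only needs to rank for low-competition queries.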
Keyword research for site structure
Keyword research is the foundation of SEO. If you want to drive traffic to your website from search, then you need to understand which terms people are searching for, how hard those terms are to rank for, how many potential visitors you could receive from those terms based on your page’s ranking, and the potential revenue those terms could bring in (based on conversion rates and average cart value).
Of course, you don’t have all of this information from the start, especially if you have not built or optimized an ecommerce website before. But you can optimize towards traffic and, from there, conversions based on the products you have on your site.
Keyword research takes time, but building your site correctly from the start reduces your time to rank and saves you the trouble of redoing URLs, fixing internal links, and implementing redirects. In short, doing the work upfront pays you back in the long term.
When you start keyword research, you have several data sources available to you:
- Google Search Console, which you should already have set up for your site.
- Competitor data in a tool like SEMrush or Moz.
- Autosuggestions from Google using a tool like Soovle.
- Any existing AdWords data you have from ads you are running.
What you’re looking for are buckets of keywords, which you can use to segment your products.
You can take a head keyword that you identify, such as [dog food], and plug it into a tool like SEMrush, which then shows you related keywords, like so:
By looking through the suggested terms based on the seed keyword of [dog food], I can tell that if I were creating a pet food store, I should create categories based on:
- Animal type
- Price (e.g. cheap, sales, discount)
- Type of dog (e.g. puppy, old dog)
- Color (e.g. Purina dog food yellow bag)
This information lets you create a strategy for labeling your products and base your top-level pages on search volume. For [dog food], I’d organize everything by brand first, then treat price and type of dog as equal levels.
Your website’s information architecture will then be based on the terms people search for most often. The top level targets your highest-volume keyword, followed by subcategories that become more specific and have smaller keyword volume. Finally, you have your products, which likely have less search volume than a brand or a type.
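The bucketing step above can be sketched as a simple grouping pass. The modifier lists and keywords below are hypothetical examples, not real search-volume data:

```python
# A minimal sketch of bucketing keywords by modifier. The modifier groups and
# keyword list are hypothetical examples, not real search-volume data.
BUCKETS = {
    "price": ["cheap", "discount", "sale"],
    "dog type": ["puppy", "senior"],
    "brand": ["purina", "pedigree"],
}

def bucket_keyword(keyword):
    """Return the first bucket whose modifier appears in the keyword."""
    words = keyword.lower().split()
    for bucket, modifiers in BUCKETS.items():
        if any(m in words for m in modifiers):
            return bucket
    return "head"  # no modifier found: treat as a head term

keywords = ["dog food", "cheap dog food", "puppy food", "pedigree dog food"]
grouped = {}
for kw in keywords:
    grouped.setdefault(bucket_keyword(kw), []).append(kw)

print(grouped)
```

In practice you would feed in the full keyword export from your research tool and let the bucket labels suggest your categories and subcategories.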
Here’s a potential dog food category -> subcategory -> products taxonomy. Your site architecture would look like this:
And your URLs would be organized like:
- Homepage - https://domain.com
- Dog food - https://domain.com/dog-food
- Pedigree dog food - https://domain.com/dog-food/pedigree
- Pedigree yellow bag - https://domain.com/dog-food/pedigree/product-name (with product name and color in the title)
- Brand - https://domain.com/pedigree
You already have [Pedigree dog food] covered with /dog-food/pedigree, so you do not also want /pedigree/dog-food underneath the Pedigree brand subfolder. However, giving Pedigree (or any brand) its own page lets you optimize for terms like “Pedigree pet food,” and it is also a great way to get all of the products under that brand indexed close to the homepage.
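The URL scheme above can be sketched with a small helper. The `slugify` and `product_url` functions here are hypothetical illustrations of the pattern; in practice, Shopify generates handles for you:

```python
import re

def slugify(text):
    """Lowercase a label and hyphenate it for use as a URL path segment."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def product_url(domain, category, brand, product):
    """Build a category/brand/product URL following the structure above.
    Hypothetical helper, not a Shopify API."""
    path = "/".join(slugify(part) for part in (category, brand, product))
    return f"https://{domain}/{path}"

url = product_url("domain.com", "Dog Food", "Pedigree", "Yellow Bag 30 lb")
print(url)
```

The key property is that every segment of the path is itself a page worth ranking: /dog-food, /dog-food/pedigree, and the product.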
💡 Note: While there are parts of the URL that can’t be changed in Shopify, you can change the handle of the page or product you’re working on. There’s more information on how to do so in Shopify’s documentation.
Keyword research tools
The right tool depends on your needs and how often you will use it. There are a few tools that I use and recommend for keyword research.
- Moz’s Keyword Explorer. This is probably the best tool for most store owners. A Moz Pro subscription costs $99 a month and gives you access to all of their tools, including campaigns, keyword tracking, competitor tracking, and more.
- SEMrush. Starts at $99 a month. It’s ideal for larger sites.
- Soovle for autosuggestions to find longer tail keywords under your head terms.
- Answer The Public to identify informational queries your audience is already searching.
A free tool you can install is the Chrome extension Keywords Everywhere, which displays search volume next to the terms you search in Google, as well as next to the keywords appearing in your Google Search Console Search Analytics report.
Organizing keywords into rank tracking buckets
Now that you have the keywords you want to rank for and have built the pages to target them, you need to track your keywords so that you can see how they are trending over time.
I recommend that you track sets of keywords more closely than individual keywords. If you track a set of 50 keywords, you can see more holistically how well you are doing in search than if you track just 1 or 2, as those 1 or 2 may rank well while the other 48 keywords you care about are neglected. Tracking only a few major keywords focuses you on those keywords instead of your whole site; when you focus on sets of keywords across your site, you can drive far more traffic.
Using SEMrush, you can tag each of your keywords so that you can see how your different buckets of keywords are ranking over time. You may identify that your brands are ranking well, but your products are not. From this knowledge, you can adjust your strategy to improve your product rankings.
Here is an example of a set of keywords in one of my SEMrush campaigns, which shows the estimated percentage of clicks I will receive for those keywords:
This is, in my opinion, a much better way to track SEO success than individual keyword rankings. I do track individual keywords that can drive outsized conversions, but if you are looking to gauge overall SEO performance, then you should really care about share of voice across your keyword sets (like dog food).
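Share of voice can be sketched as a weighted calculation over a keyword bucket. The CTR-by-position curve and the keyword volumes below are made-up illustrative numbers, not data from SEMrush or any other tool:

```python
# Made-up CTR-by-position curve for illustration; real curves vary by query.
CTR_BY_POSITION = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}

def share_of_voice(rankings):
    """rankings maps keyword -> (monthly search volume, current position).
    Returns estimated clicks captured as a share of clicks available at #1."""
    captured = sum(vol * CTR_BY_POSITION.get(pos, 0.0)
                   for vol, pos in rankings.values())
    available = sum(vol * CTR_BY_POSITION[1] for vol, _ in rankings.values())
    return captured / available

# Hypothetical dog food bucket: (volume, position) per keyword.
bucket = {
    "dog food": (100_000, 4),
    "dry dog food": (20_000, 2),
    "pedigree dog food": (5_000, 1),
}
print(f"share of voice: {share_of_voice(bucket):.1%}")
```

A single number per bucket, trended over time, tells you far more than watching one keyword bounce between positions 3 and 5.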
Basics of internal linking
Organizing your information architecture in a way that meets your users’ needs and takes search engine crawlers from the most important to least important pages accomplishes a few tasks:
- Category and subcategory pages become strong because they are linked from other highly-linked pages (like your homepage), which allows you to rank for more competitive search terms.
- All of your pages are brought higher in the architecture to get them indexed and ranked.
- Website visitors and customers can more quickly navigate to their goal without having to rely on site search or a crowded top navigation.
There are a few ways to make sure that your most important pages are linked to from as many relevant pages as possible.
First, your most competitive keywords should be one level off your homepage. In our pet food example, this will be your dog food, cat food, and brand pages. Second, use your top navigation to link to your most important pages using relevant anchor text. One strategy a lot of ecommerce websites use is a jumbo top navigation:
And third, you should use breadcrumb navigation links to link from product pages back up to, at minimum, their top level category (and Shopify allows you to do this). If it makes sense, then link as well to the product’s main subcategory. Here’s how Petco does it:
I would actually extend this and also link to “Dry Dog Food” in this breadcrumb, then list the product name (or a shortened name) at the end of the breadcrumb trail. Doing this would help Petco rank better for [dry dog food], and done at scale, it could really help them rank better in their longer tail of keywords, because more internal links from strong pages correlates strongly with better rankings.
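A breadcrumb trail like the one described above can be generated from a product’s category path. This is a hypothetical sketch of the pattern, not how Petco or Shopify implements it:

```python
# Hypothetical sketch: render breadcrumb links from a product's category
# trail, so every product page links back up to its categories.
def breadcrumb_html(trail):
    """trail: list of (label, path) tuples ending at the current product."""
    *links, (last_label, _) = trail
    parts = [f'<a href="{path}">{label}</a>' for label, path in links]
    parts.append(last_label)  # current page is plain text, not a link
    return " > ".join(parts)

crumbs = [
    ("Dog", "/dog"),
    ("Dog Food", "/dog/dog-food"),
    ("Dry Dog Food", "/dog/dog-food/dry"),
    ("Product Name", "/dog/dog-food/dry/product-name"),
]
print(breadcrumb_html(crumbs))
```

Each crumb is a real `<a href>` link, so every product page passes a little equity back up to its category and subcategory.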
Avoiding the SEO pitfall of duplicate content
If your website uses consistent URLs (hyphens as word separators, a single www or non-www version) and is secured with an SSL certificate, you’ve already built yourself a good base upon which to control duplicate content. For a primer on why duplicate content is harmful, check out this resource from Moz.
The most common duplicate content issue I see on ecommerce sites is confusing filters with facets, and not controlling them via either canonical tags or Search Console.
First, let’s define the two terms.
When I talk about a facet, I mean a narrowing of a category into a subcategory that still has search volume. If [dog food] is a category, then [dry dog food] is a facet. Here’s how Petco does it for their categories; we know these are facets because each one links with an <a href> to a new URL:
When I talk about a filter, I mean a narrowing that is useful for users, but where there is no search volume. If you look at Petco’s [dog food] page, they also have a lot of filters on the side that do not link to a new URL:
There are many different ways to control indexation of filtered URLs, including:
- Using a hash (for example, /dog-food#under10k) in the URL as content behind a hash in a URL is not seen by search engines.
- Using a parameter in combination with a canonical back to the URL you want indexed (for example, /dog-food?price=under10k with a canonical back to /dog-food). This way, you do not have to redo your filtering technology, but you also essentially tell the search engines that “this ?price=under10k is a subset of /dog-food that should not be indexed because no one is searching for it”. The rel-canonical tag can be a good way to control duplicate content.
- Controlling the parameter in Google Search Console and Bing Webmaster Tools by marking it as “Yes Changes Content”, “Narrows” to say that the content gets more specific, and “No URLs” so that Google does not crawl them.
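The parameter-plus-canonical approach can be sketched like this. The filter parameter names below are hypothetical; the point is that a filtered URL’s canonical tag points back to the base category path:

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical filter parameter names; facet paths like /dog-food/pedigree
# keep their own canonical, while filtered URLs canonicalize to the base path.
FILTER_PARAMS = {"price", "size", "rating", "sort"}

def canonical_url(url):
    """Return the URL with filter parameters stripped, for use in a
    <link rel="canonical" href="..."> tag on the filtered page."""
    parts = urlsplit(url)
    kept = [p for p in parts.query.split("&")
            if p and p.split("=")[0] not in FILTER_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       "&".join(kept), ""))

print(canonical_url("https://domain.com/dog-food?price=under10k"))
# -> https://domain.com/dog-food
```

With this in place, /dog-food?price=under10k still works for shoppers, but search engines consolidate its signals onto /dog-food.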
Less effective ways to control duplicate content created by filters, depending on the implementation, are to do it:
- Through robots.txt Disallow. This will work, but will likely cause bigger issues than it solves. Robots.txt is a sledgehammer, not a precision tool.
- Through meta robots noindex. This will keep the URLs from appearing in the search index, but will not keep the search engines from crawling them.
To quickly explain the two above: every website should have a robots.txt file, which is essentially a set of directives that search engines have agreed to follow. Shopify’s can be found at https://www.shopify.com/robots.txt, for example. The robots.txt file is a way to tell search engines not to access certain parts of your site, especially areas like staging subdomains or logged-in pages that may contain personal information.
The meta robots tag is a simple meta tag, formatted like <meta name="robots" content="">, where the content value can either be blank (defaulting to index, meaning the search engines should store the URL in their index) or noindex, which tells the search engines that you do not want that URL to be in their index.
The key difference between a robots.txt block and a meta noindex directive is that search engines will still crawl pages with meta noindex. The noindex removes a page you do not want in the index, but search engines can still spend crawl budget on it instead of on pages you do want to rank.
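You can see the crawling half of this with Python’s standard-library robots.txt parser. The paths below are hypothetical; note that a URL blocked by Disallow is never crawled at all, which is also why a meta noindex tag on a robots.txt-blocked page can never even be read:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt rules with hypothetical paths.
rules = """\
User-agent: *
Disallow: /staging/
Disallow: /account/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Blocked path: a well-behaved crawler never fetches it.
print(parser.can_fetch("*", "https://domain.com/staging/test-page"))  # False
# Normal category page: crawlable as usual.
print(parser.can_fetch("*", "https://domain.com/dog-food"))           # True
```

A page with meta noindex, by contrast, would return True here: the crawler fetches it, reads the tag, and only then drops it from the index.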
As you can tell, duplicate content takes specific strategies to control so that filters do not negatively affect your rankings, traffic, and revenue!
You've hopefully gained a much deeper understanding of website information architecture and why it matters so much to SEO. You also now understand how to get started with keyword research, how to identify the keywords that matter to your business, and how to structure your URLs, categories, facets, and filters to avoid duplicate content and drive more traffic than before.

Next steps
- Use a crawl tool like Moz or Screaming Frog to first get a full view of your existing website. If you do not have an existing website, you can skip this step.
- Do keyword research to discover the main categories and subcategories that your products fit underneath. There is no “correct” or “optimal” number of categories or subcategories; you need the amount that accurately describes your products. Don’t go crazy with subcategories that only contain one product as these may be seen as low quality by search engines and will not rank, but do not be afraid to go to the longer tail (e.g. keyword terms like “Pedigree wet dog food pouches”) to rank.
- Use the keyword volume data given to you and determine which terms should be facets (e.g. categories or subcategories) and which should be filters (e.g. shoe size or dog food price). Shopify handles filters well out of the box, with canonicals back to the category/subcategory page.
Through a combination of top navigation links, sidebar links to subcategories on category pages, and breadcrumbs from products back up to their subcategories and categories, you can have all of your important pages indexed and given the chance to rank.