One of the most important features of WordPress is its permalink rewrite engine, which creates all those pretty links we are so used to, with no query elements like question marks or ampersands. Did you ever wonder how permalinks work and what you can do to customize them? Read on.
This article explains what happens when WordPress resolves a URL request that uses pretty permalinks. It doesn’t include examples for creating custom permalink structures.
By default, pretty permalinks used to be disabled in WordPress. Since WordPress 4.1, they are enabled on new installations whenever it is possible to activate them. For pretty permalinks to work, WordPress needs access to the .htaccess file (on an Apache server), where it adds a few lines that are the base for the rewrite engine in WordPress. That requires the Apache mod_rewrite module. This module is part of Apache installations and is normally active (or at least it should be). The code added to the .htaccess file basically sends every URL that doesn’t point to an existing file or directory to the WordPress index.php, which allows WordPress to break the URL into parts and use regular expressions to detect what content to display.
WordPress also supports PATHINFO permalinks that don’t require the mod_rewrite module, but in that case permalinks must start with index.php. You can find more info on this on the WordPress Codex Permalinks page. There are some setup differences between Apache, IIS and other servers, but permalinks can work with any of the currently available servers for Windows, Linux or macOS.
So, the basic step to activate the rewrite engine in WordPress is to enable permalinks from the WordPress Settings/Permalinks panel. If it is set to Default, permalinks are disabled; with any other value they are active.
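To make the idea of sending everything to index.php more concrete, here is a minimal sketch in Python. It is only a model of the decision the web server makes, not WordPress or Apache code, and the file and directory names in it are made up for illustration.

```python
# Hypothetical contents of the site's document root (assumptions for this sketch).
EXISTING_FILES = {"/index.php", "/wp-content/uploads/logo.png"}
EXISTING_DIRS = {"/wp-admin"}


def route(request_path: str) -> str:
    """Return the path the web server would actually serve for a request."""
    path = "/" + request_path.lstrip("/")
    # Real files and directories are served directly, untouched.
    if path in EXISTING_FILES or path.rstrip("/") in EXISTING_DIRS:
        return path
    # Everything else is handed to WordPress, which parses the URL itself.
    return "/index.php"


print(route("/wp-content/uploads/logo.png"))
print(route("/author/admin/feed/rss/"))
```

The second request doesn’t exist on disk, so it ends up at index.php, and that is where the rewrite rules described later in this article take over.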
Why do you need these permalinks?
Well, there are many reasons. The most important ones are SEO (not that important, since search engines handle query-based URLs just fine) and that they look much nicer and are more user-friendly than long query strings in the URL, especially if you have a complex website structure. Search engine optimization benefits from permalinks because they contain more relevant information about the content, and most search engines make use of that when indexing websites.
Older versions of WordPress were known to work slower with some permalink structures that required additional SQL queries, but since WordPress 3.3 the most critical permalink structures are optimized, and there is no performance penalty if you use only the post name in the URL. Even with older WordPress versions, performance was affected only by some structures and only with a large number of posts.
Permalinks are a must on a website, and very few WordPress websites don’t use them (mostly default installations left with permalinks disabled). What to choose for permalinks is another matter.
Default permalinks settings
To get started quickly with permalinks, you can select one of the predefined rules from the Permalinks panel. You can also see the Custom Structure field, where you can create the structure you want. What this panel doesn’t say is that those rules apply only to posts (the default post type: post). WordPress allows you to customize the rewrite rules for this post type only, nothing else! Pages in WordPress always use the same rewrite structure: only a sanitized version of the page name. And that is a rule you can’t change. For archives, the Permalinks panel offers only two settings: what to use as a base for category and tag archives.
Besides this, there are plenty more rewrite rules that WordPress will not allow you to change (well, not directly anyway). These rules include archives for authors, date-based archives, attachments, feeds, custom post types, custom taxonomies, and generic archives.
When you create rules on the Permalinks panel, you don’t need to handle regular expressions, but you can use special tags to form the URL. By default, WordPress uses several structure tags (full list in the WordPress Codex). If you use these tags in the URL, they will be replaced with actual data. If an element in the URL is not recognized, it will be left as it is, as a part of the URL. This way you can add static parts to the URL.
Most things in WordPress use more than one rewrite rule. To resolve a year-based archive, WP needs separate rules to match the basic year archive URL, the URL with page numbers and the feed URL. Some things need more than 3 rules; for a post you need 5 rules.
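Here is a small sketch of how structure tags turn into an actual URL. It is written in Python only to illustrate the idea and is not how WordPress implements it; the post data is invented, and only a few of the structure tags are handled.

```python
import re
from datetime import datetime

# A made-up post, used only for this illustration.
POST = {"id": 42, "slug": "hello-world", "author": "admin",
        "date": datetime(2015, 3, 9)}


def build_permalink(structure: str, post: dict) -> str:
    """Replace known structure tags with post data; leave the rest as-is."""
    published = post["date"]
    values = {
        "%year%": f"{published:%Y}",
        "%monthnum%": f"{published:%m}",
        "%day%": f"{published:%d}",
        "%post_id%": str(post["id"]),
        "%postname%": post["slug"],
        "%author%": post["author"],
    }
    # Unknown tags and plain text are kept unchanged (the static URL parts).
    return re.sub(r"%[a-z_]+%",
                  lambda tag: values.get(tag.group(0), tag.group(0)),
                  structure)


print(build_permalink("/%year%/%monthnum%/%postname%/", POST))
print(build_permalink("/blog/%postname%/", POST))
```

In the second call, "blog" is not a tag, so it simply stays in the URL as a static part.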
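To show why one archive needs several rules, here is a sketch with three rules for the year archive. The expressions are modeled on the ones WordPress generates, but treat them as an approximation: the exact set depends on the WordPress version and settings, and the Python matching loop is my own simplification.

```python
import re

# Three rules for one archive type: feed, paged and basic year archive.
YEAR_RULES = [
    (r"([0-9]{4})/feed/(feed|rdf|rss|rss2|atom)/?$",
     "index.php?year=$matches[1]&feed=$matches[2]"),
    (r"([0-9]{4})/page/?([0-9]{1,})/?$",
     "index.php?year=$matches[1]&paged=$matches[2]"),
    (r"([0-9]{4})/?$",
     "index.php?year=$matches[1]"),
]


def which_rule(request: str):
    """Return the query template of the first rule matching the request."""
    for pattern, query in YEAR_RULES:
        # re.match anchors at the start of the request, like the rule engine.
        if re.match(pattern, request):
            return query
    return None


for request in ("2015/", "2015/page/2/", "2015/feed/rss2/"):
    print(request, "->", which_rule(request))
```

Each of the three requests is picked up by a different rule, which is why a single expression is not enough for even a simple archive.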
Resolving rewrite rules
To resolve a URL, WordPress uses a list of rewrite rules. Rules are based on regular expressions, and each regular expression in the list points to a query-based URL. For instance, here is the rule to resolve an author archive feed:
Regular Expression: author/([^/]+)/feed/(feed|rdf|rss|rss2|atom)/?$
WP Resolved Query: index.php?author_name=$matches[1]&feed=$matches[2]
I am not going to go into how regular expressions work, but the URL is matched against all regular expressions in the rules list until we get a match. When the URL matches an expression, the query for that expression is used to resolve the requested URL to a query-based URL. In the query in the example above you see $matches[1] and $matches[2]. These are values captured by the regular expression from the requested URL, the URL parts marked with ‘(‘ and ‘)’ in the regular expression. So, if your URL is this (with example.com standing in for your own site address):
http://example.com/author/admin/feed/rss/
The website URL part is removed, and we get the request:
author/admin/feed/rss/
This is what WordPress then matches against the rewrite rules, and it will match our example expression above. The words ‘admin’ and ‘rss’ are extracted by the regular expression, they replace $matches[1] and $matches[2], and the resolved query-based URL is now this:
index.php?author_name=admin&feed=rss
And this is something WordPress can use to prepare the page and load the template and the data for it. This final query is used to create the WP Query object that is ultimately used to get the posts for that request.
If the URL is resolved by some rewrite rule, but the WordPress query doesn’t find any results to match, you will get a 404 error. If all other rewrite rules fail, the last rule in the list will always resolve. Depending on your permalinks settings, this base rule resolves all unmatched requests to a page or category. If that page or category is not found, you again get a 404 error.
Handling custom rewrite rules is not easy, and very often you will run into conflicts. If you use the same regular expression for your rule as an existing one, you will remove the default rule that was using that expression, and that will most likely break something. When working with rules, it is useful to have a list of all the rules in the system.
The WordPress rewriter is very powerful, but by default it uses only a fraction of what it can do, since only basic rules are implemented, and if you need more, you must write code for that. The goal was to keep things simple in the core and to allow freedom to customize things if needed. Not everyone will need such complex rewrite rules, and for most the basic ones are enough. If you need more, you need to dig deeper or use specialized plugins to achieve that.
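The whole resolving process described above can be sketched in a few lines. This is a Python model of the idea, not the WordPress source: the example.com address and the two-rule list are assumptions made for the illustration.

```python
import re

# A tiny rule list: the author feed rule from above plus the plain author archive.
RULES = [
    (r"author/([^/]+)/feed/(feed|rdf|rss|rss2|atom)/?$",
     "index.php?author_name=$matches[1]&feed=$matches[2]"),
    (r"author/([^/]+)/?$",
     "index.php?author_name=$matches[1]"),
]


def resolve(url: str, home: str = "http://example.com/"):
    """Turn a pretty URL into a query-based URL, or None if no rule matches."""
    # Step 1: remove the website part to get the bare request.
    request = url[len(home):] if url.startswith(home) else url.lstrip("/")
    # Step 2: try the rules in order; the first match wins.
    for pattern, query in RULES:
        found = re.match(pattern, request)
        if found:
            # Step 3: replace each $matches[n] with the n-th captured value.
            return re.sub(r"\$matches\[(\d+)\]",
                          lambda ref: found.group(int(ref.group(1))),
                          query)
    return None


print(resolve("http://example.com/author/admin/feed/rss/"))
```

Because the first matching rule wins, the order of the rules matters as much as the expressions themselves.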
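The conflict described here is easy to picture if you think of the rules as one table keyed by the regular expression, so that each expression can appear only once. The sketch below is a simplified Python model under that assumption; the custom rule and its report_year query variable are hypothetical.

```python
# Simplified rule table: regular expression -> query template.
DEFAULT_RULES = {
    r"author/([^/]+)/?$": "index.php?author_name=$matches[1]",
    r"([0-9]{4})/?$": "index.php?year=$matches[1]",
}


def add_rule(table: dict, pattern: str, query: str):
    """Store a rule and return the query it displaced (None if no conflict)."""
    displaced = table.get(pattern)
    table[pattern] = query
    return displaced


rules = dict(DEFAULT_RULES)
# Hypothetical custom rule that reuses the year archive expression.
lost = add_rule(rules, r"([0-9]{4})/?$", "index.php?report_year=$matches[1]")
if lost is not None:
    print("Conflict, this rule is gone:", lost)

# Listing every rule makes conflicts like this much easier to spot.
for pattern, query in rules.items():
    print(pattern, "=>", query)
```

The table still has two rules after the call, but the year archive no longer resolves, which is exactly the kind of breakage that is hard to notice without a list of all rules.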
8 thoughts on “How WordPress URL rewriting works?”
Can you tell me how to actually achieve this type of URL structure?
Is this done on the Apache level, or can PHP handle this?
I would like to have a structure similar to the WordPress one you show, with /name-of-thing/name-of-post/
I know this can be found in the open source code, but if it is possible to help me find where it is in the large code base, that would be great!
Great work and thank you in advance.
I will soon start a series of posts on this subject with practical examples. But this is done using the WordPress rewriting engine, and it can’t be used for other CMS systems.
FWIW, Google itself has already stated, almost four years ago, that it has no problems crawling URLs with question marks and ampersands — and actually goes so far as to specifically recommend _not_ using URL rewriting for dynamic websites (like those created with WordPress), as doing so can cause problems when crawling sites:
Pretty URLs _are_ more visually appealing and easier for visitors to understand, but it would be best not to perpetuate the myth that it’s better for SEO or hard for search engines to crawl such URLs.
Really useful and very well explained! thanks a lot!
Interesting post. Curious question? Can I use this to forward this permalink with accented characters:
to its corresponding permalink in WordPress without the accents:
No, these are not the same characters. It could be done with additional filtering in rewriter to replace characters, but it will not work by default.
I see. Thanks a million! That was fast!
Thank you thank you!
redirect_canonical is evil… it was messing up my attempts to parse the slug myself. The one line of code you give above saved me HOURS of work!!