Handling Meta Tags in Client-side React App
November 20, 2019
A client-side rendered app isn’t of much use when you want to add Open Graph tags for Facebook, card tags for Twitter, and so on.
If you have ever created a single-page app, you know that you can’t have different meta tags (titles, descriptions, etc.) for each page. Because… you guessed it right, it’s a single-page app, so there is only one index.html file, and every page picks up its meta tags from that file. So it’s not possible to just hardcode the meta tags or Open Graph tags into your index.html. I mean, you can, if you want the same meta tags for every page on your website, but that’s not ideal for most use cases.
In my testing, I realized that Google’s crawlers are smart enough to wait up to 10 seconds for the page to render and then read the tags and other data. So Google wasn’t an issue for us: we just used React Helmet to inject the tags dynamically at runtime. React Helmet lets you insert head tags from any page in your application. Social media crawlers, however, don’t execute JavaScript, so the tags Helmet injects at runtime are invisible to them.
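With Helmet, each route declares its own head tags as part of its render. A minimal sketch of what that looks like (the component and field names here are my assumptions, not CleverX’s actual code):

```jsx
import React from "react";
import { Helmet } from "react-helmet";

// Hypothetical public profile page; `expert` is assumed to come
// from props or a data fetch
function ExpertProfile({ expert }) {
  return (
    <div>
      <Helmet>
        <title>{expert.name} | CleverX</title>
        <meta name="description" content={expert.bio} />
        <meta property="og:title" content={expert.name} />
        <meta property="og:description" content={expert.bio} />
        <meta property="og:image" content={expert.photoUrl} />
        <meta name="twitter:card" content="summary_large_image" />
      </Helmet>
      {/* rest of the profile page */}
    </div>
  );
}

export default ExpertProfile;
```

This works fine for browsers (and for crawlers that execute JavaScript, like Google’s), because Helmet swaps the tags into the document head at runtime.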
This was a major problem for us because at CleverX, one of the main features our experts have is the ability to share their public profile on social platforms and use our dashboard to track the hits on their profile and the source of that traffic. So, to maximize the number of clicks on those links, a shared link needs to look like this ⬇️
It’s personalized, it helps get attention and results in more clicks on the link.
So, the first thing that came to our mind was to find a list of the Facebook, Twitter, and LinkedIn crawlers and redirect them to a different server, which would insert the meta tags and serve them a different template. We tried this, and it worked, but it wasn’t ideal, as you will see below.
Facebook mentions their crawler’s user agents, which are `facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)` and `facebookexternalhit/1.1`.
Twitter is also kind enough to mention theirs (`Twitterbot`).
LinkedIn caused a lot of problems with this approach. First of all, I couldn’t find any official mention of the user agents their crawlers use. Second, even after we figured out their crawlers’ user agents from unofficial sources and our own logs, the URL wouldn’t render properly every time. It was very inconsistent for some reason.
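The redirect approach above can be sketched in Nginx roughly like this (the upstream address and the LinkedIn user agent are illustrative; the LinkedIn value in particular came from unofficial sources, not documentation):

```nginx
# Map known social crawler user agents to a flag
map $http_user_agent $is_social_crawler {
    default                 0;
    ~*facebookexternalhit   1;
    ~*Twitterbot            1;
    ~*LinkedInBot           1;  # unofficial, found via logs
}

upstream meta_tag_server {
    # Hypothetical server that fills in meta tags and serves a template
    server 127.0.0.1:3000;
}

server {
    listen 80;

    location / {
        # Send crawlers to the meta-tag server, everyone else
        # gets the static single-page app
        if ($is_social_crawler) {
            proxy_pass http://meta_tag_server;
        }
        root /var/www/app;
        try_files $uri /index.html;
    }
}
```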
The problem with this approach is that what we were doing is known as “cloaking”: serving crawlers different content than users, which is something that search engines (read: Google) and other social web giants don’t like, because it’s a technique used by hackers and spammers to fool crawlers and then scam people. Search engine crawlers are much smarter now: they can detect cloaking, and cloaking will cause your rank to go down on Google and can even get your URL blacklisted on some sites. That’s not what we wanted, so we immediately dropped this approach.
Also, the problem with targeting user agents was that they could change at any time without notice, which would break a lot of things for us, so we needed a more permanent solution. So we moved on to the next possible solution.
So the next option was this: instead of serving index.html directly with Nginx, we decided to serve it with Node and Express. Now, instead of hardcoding the meta tags, our index.html contained placeholder variables in their place.
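A minimal sketch of what such a templated index.html could look like (the placeholder names are my assumptions, not the actual ones used):

```html
<!-- Sketch of a templated index.html for a single-page app -->
<!DOCTYPE html>
<html>
  <head>
    <title>__PAGE_TITLE__</title>
    <meta name="description" content="__META_DESCRIPTION__" />
    <meta property="og:title" content="__OG_TITLE__" />
    <meta property="og:description" content="__META_DESCRIPTION__" />
    <meta property="og:image" content="__OG_IMAGE__" />
    <meta name="twitter:card" content="summary_large_image" />
  </head>
  <body>
    <div id="root"></div>
  </body>
</html>
```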
Now every request, whether from a user or a crawler, went through our Node and Express server, and these variables were replaced dynamically at runtime with the appropriate values from a config file. Simple, right?
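The replacement step itself is just string substitution on the template before it goes out the door. A sketch under assumed placeholder names (the real config lookup and names may differ):

```javascript
// Replace assumed placeholders like __PAGE_TITLE__ with values
// from a per-page meta config
function injectMetaTags(template, meta) {
  return template
    .replace(/__PAGE_TITLE__/g, meta.title)
    .replace(/__META_DESCRIPTION__/g, meta.description)
    .replace(/__OG_TITLE__/g, meta.ogTitle)
    .replace(/__OG_IMAGE__/g, meta.ogImage);
}

// Inside an Express handler it would be used roughly like this:
//
// const fs = require("fs");
// const express = require("express");
// const app = express();
//
// app.get("*", (req, res) => {
//   const template = fs.readFileSync("build/index.html", "utf8");
//   res.send(injectMetaTags(template, metaConfigFor(req.path)));
// });
```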
So we tested this configuration on our test server. We had to make a lot of changes to our Nginx config, because a lot of APIs stopped working, especially the ones with query parameters. But that was an easy fix.
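I won’t reproduce our exact config, but one common pitfall in this kind of setup is that `proxy_pass` with a URI part rewrites the request path and can drop the query string; when a variable such as `$request_uri` is used, Nginx passes the original URI (query string included) through untouched. An illustrative sketch only, not our actual fix:

```nginx
# Illustrative: preserve the full original URI, including the
# query string, when proxying API calls to the Node backend
location /api/ {
    proxy_pass http://127.0.0.1:3000$request_uri;
}
```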
We tested this config on all the social media sites and it worked perfectly. The best part: we weren’t cloaking anymore, and this solution scales easily to new public URLs on our website.