Dynamic Sitemap in Flask
Posted on
by Kevin Foong
Sitemaps are a handy way to tell search engines which pages on your website they should crawl. In Flask you can create a sitemap.xml route that is generated automatically and always reflects the latest pages on your site. That is, every time you add a route, it automatically appears in your sitemap.xml.
You can see this site's dynamic sitemap at https://www.kevin7.net/sitemap.xml
The full code is shown below. The key points are:
- Create a route as per normal and name the route "sitemap.xml"
- To get all the static routes, iterate over the app's "url_map" attribute with its "iter_rules" method and include any route that accepts the HTTP "GET" method and takes no URL parameters.
- For the static routes you can just use an arbitrary last modified date, for example 10 days before today.
- To get all the dynamic routes (blog postings, etc) you will need to connect to your database, read in each blog posting, and manually build your routes. If your blog post has a timestamp in the database, then use that for the sitemap last modified date.
- Finally, render sitemap.xml via a template. Make sure you set the "Content-Type" header to "application/xml", or else it won't render correctly.
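The template itself is not shown in this post. As a rough guide to the XML it needs to produce, here is a minimal sketch that builds the same structure with Python's standard library rather than a Jinja template. `build_sitemap_xml` is a hypothetical helper name, and `pages` is assumed to be the same list of `[url, lastmod]` pairs that the route builds.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_xml(pages):
    """Return sitemap XML as a string for a list of [url, lastmod] pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url          # text is XML-escaped on output
        ET.SubElement(entry, "lastmod").text = lastmod  # expected as YYYY-MM-DD
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap_xml([["https://example.com/", "2024-01-01"]]))
```

A Jinja template would produce the same elements by looping over `pages` and writing one `<url>` entry per page.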
On the Google side, the first time you do this you will need to submit your sitemap in the Google Search Console. Thereafter, every time you want Google to crawl your site, perhaps because you have added some new content, all you need to do is send a ping like so.
http://www.google.com/ping?sitemap=https://example.com/sitemap.xml
Google's documentation can be found here
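If you want to build the ping URL from code, a small sketch might look like the following. `build_ping_url` is a hypothetical helper name, and the sketch only constructs the URL with the sitemap address percent-encoded; it does not send the request. Note that Google announced in 2023 that it would retire this ping endpoint, so check the current documentation before relying on it.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above
PING_ENDPOINT = "http://www.google.com/ping"


def build_ping_url(sitemap_url):
    """Return the ping URL with the sitemap address percent-encoded."""
    return PING_ENDPOINT + "?" + urlencode({"sitemap": sitemap_url})


if __name__ == "__main__":
    print(build_ping_url("https://example.com/sitemap.xml"))
```

The resulting URL could then be requested with `urllib.request.urlopen` or any HTTP client.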
One advantage of having a sitemap is that if there are any issues with your links, Google will actually send you an email informing you! It will probably not do this if your web pages were only found via crawling. See below.
The full source code is shown below.
from datetime import datetime, timedelta

from flask import current_app, make_response, render_template, url_for

# bp (the blueprint) and Post (the blog post model) come from the application itself


@bp.route('/sitemap.xml', methods=['GET'])
def sitemap():
    pages = []

    # get static routes
    # use arbitrary 10 days ago as last modified date
    lastmod = datetime.now() - timedelta(days=10)
    lastmod = lastmod.strftime('%Y-%m-%d')
    for rule in current_app.url_map.iter_rules():
        # omit auth, admin and test routes and routes with parameters;
        # only include routes that accept GET
        if 'GET' in rule.methods and len(rule.arguments) == 0 \
                and not rule.rule.startswith('/admin') \
                and not rule.rule.startswith('/auth') \
                and not rule.rule.startswith('/test'):
            pages.append(['https://www.kevin7.net' + rule.rule, lastmod])

    # get dynamic routes
    posts = Post.query.filter(Post.current.is_(True)).all()
    for post in posts:
        url = 'https://www.kevin7.net' + url_for('main.post_detail', slug=post.slug)
        last_updated = post.update_date.strftime('%Y-%m-%d')
        pages.append([url, last_updated])

    sitemap_template = render_template('sitemap/sitemap_template.xml', pages=pages)
    response = make_response(sitemap_template)
    response.headers['Content-Type'] = 'application/xml'
    return response
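The route filter in the code above could also be pulled out into a small function, which makes the exclusion list easier to extend and test. This is only a sketch of one possible refactoring: `include_in_sitemap` and `EXCLUDED_PREFIXES` are hypothetical names, and the function takes plain values so it can be tried without a running Flask app.

```python
# Route prefixes that should never appear in the sitemap
EXCLUDED_PREFIXES = ("/admin", "/auth", "/test")


def include_in_sitemap(path, methods, arguments):
    """Return True for parameterless GET routes outside the excluded prefixes."""
    return (
        "GET" in methods
        and not arguments
        # str.startswith accepts a tuple and checks every prefix in it
        and not path.startswith(EXCLUDED_PREFIXES)
    )


if __name__ == "__main__":
    print(include_in_sitemap("/about", {"GET", "HEAD", "OPTIONS"}, set()))
```

Inside the loop it would be called as `include_in_sitemap(rule.rule, rule.methods, rule.arguments)`.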