Google re-indexing

I just moved to REW from another company, and my site was already ranking VERY well on certain pages and phrases. Those pages are still ranking, but I’m getting a 404 error on certain pages. How do I fix this? Is it possible to submit an XML sitemap to have Google re-index my site, since the URL structure is slightly different than before?

If you can, it is typically best to handle this pre-launch: any high-value page URLs should be kept the same. BUT if you cannot do that, what you do instead is set up a 301 (moved permanently) redirect.

You can find the redirect tool here: /backend/cms/tools/rewrites/

This sends consumers who follow the old link to the right place, AND it tells Google to pass the PageRank and link authority through to the new page as well.
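Purely to illustrate what the rewrite tool is doing on your behalf (this is not REW's actual code, and the neighborhood paths below are made up), a 301 redirect is just a response that tells browsers and Googlebot the page has permanently moved to a new URL:

```python
# Minimal sketch of a 301 redirect (hypothetical paths, not REW's backend).
# A request for the old URL gets a "301 Moved Permanently" response whose
# Location header points at the new URL.
from http.server import BaseHTTPRequestHandler, HTTPServer

# old path -> new path (hypothetical neighborhood page move)
REDIRECTS = {
    "/neighborhoods/sunset-hills": "/sunset-hills",
}

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in REDIRECTS:
            self.send_response(301)  # Moved Permanently
            self.send_header("Location", REDIRECTS[self.path])
            self.end_headers()
        else:
            self.send_response(404)  # anything without a redirect still 404s
            self.end_headers()

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), RedirectHandler).serve_forever()
```

You don't need to write anything like this yourself; the rewrite tool above manages the old-to-new mappings for you. The sketch is just to show what happens on the wire.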

A 404 is natural (Google knows how to figure it out); as long as you have links to your new pages, it will eventually crawl your new content. That said, yes, you can submit a sitemap to Google, and for your most important pages you can also go directly to Google Search Console and request indexing.
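If you want to sanity-check the sitemap before submitting it in Search Console, a quick sketch like this (swap the placeholder domain for your own) confirms it returns a 200 and parses as valid XML:

```python
# Sketch: verify the sitemap is reachable and well-formed before submitting.
# SITEMAP_URL is a placeholder; use your own domain.
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://www.example.com/sitemap.xml"

with urllib.request.urlopen(SITEMAP_URL) as resp:
    print("HTTP status:", resp.status)  # expect 200
    body = resp.read()

root = ET.fromstring(body)              # raises if the XML is malformed
print("Root element:", root.tag)        # a sitemap index or a urlset
```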

Note that if you changed pages and did not redirect immediately, there can be a period of time where Google needs to sort out what it thinks is going on with your site, so it can take a bit to normalize.

The best proactive advice if you are in this situation is to focus on building new external links to your new pages, which will help Google form an opinion of their value faster.

The pages I’m referring to are typically neighborhood pages. My old structure was domain.com/neighborhoods/neighborhood-name. Now it’s just domain.com/neighborhood-name. Is there a way on the backend, when I’m creating the community pages, to keep the same structure?

Where can I get my sitemap.xml file to resubmit to Google? When I go to my domain.com/sitemap.xml it shows me like 10 different sitemap options: sitemap_0.xml, sitemap_1.xml, etc.

That’s a good question. @shazmin, can we support a /category/sub-cat/ structure? I thought we could, but I don’t know how to do it. Maybe make “Neighborhoods” a main page, and then the neighborhood name a sub-page?

Please have someone reply with a detailed answer. Thanks.

@AndyM,

We implement the multiple-sitemap-file approach. The sitemap.xml file is the main index and the only one you need to submit. The other files (sitemap_0.xml, sitemap_1.xml, etc.) are the sub-files: the sitemap specification says no single XML file can contain more than 50,000 URLs, and since we include IDX detail pages in our generated sitemaps, this multiple-file approach is necessary.
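If you're curious what the index actually contains, a rough sketch like this (placeholder domain) lists the sub-sitemap files that the main sitemap.xml points to:

```python
# Sketch: read the main sitemap.xml (a sitemap *index*) and print the
# sub-sitemap files it references (sitemap_0.xml, sitemap_1.xml, ...).
# INDEX_URL is a placeholder; each sub-file stays under the 50,000-URL limit.
import urllib.request
import xml.etree.ElementTree as ET

INDEX_URL = "https://www.example.com/sitemap.xml"
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

with urllib.request.urlopen(INDEX_URL) as resp:
    root = ET.fromstring(resp.read())

# A sitemap index lists <sitemap><loc>...</loc></sitemap> entries
for loc in root.findall("sm:sitemap/sm:loc", NS):
    print(loc.text)
```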

The sitemap.xml files in our REW CRM are regenerated every 24 hours and submitted to Google automatically, although you can also submit them manually in your Google Search Console.

Yes! All you need to do is add a forward slash in the Filename/Alias field (see the right sidebar) where you want to fake the directory structure. For example: neighborhoods/neighborhood-name.


I re-indexed my site just to be safe and got this message. Any idea why or what I can do to fix it?

“Search Console has identified that your site is affected by 1 Page indexing issue(s). The following issues were found on your site.

Top Issues
• Blocked by robots.txt

We recommend that you fix these issues when possible to enable the best experience and coverage in Google Search.”

You should not get that error as long as you’re submitting pages meant to be crawled.

REW uses the robots.txt file to block folders we don’t want indexed (or attempted to be indexed).

These are the areas we block/don’t want to be indexed.

Which page did you try to submit?

I submitted my sitemap address, www.mandelwillsell.com/sitemap.xml

I don’t think this is due to the sitemap.xml. The sitemap helps Google crawl your site, but it still actually crawls the site naturally from there. In this case, which I have seen before in Search Console, Googlebot is likely finding URLs like listing/*/brochure through natural means and then gets conflicted because the robots.txt says not to crawl them. It then reports this in Google Search Console as warnings or errors.
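You can reproduce what Googlebot is seeing with a quick check against the robots.txt file. The sketch below uses Python's standard robots.txt parser; the domain and listing URL are just placeholder examples of a blocked brochure page:

```python
# Sketch: check a discovered URL against the site's robots.txt, the same
# decision Googlebot makes. The domain and listing path are hypothetical.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://www.example.com/robots.txt")
rp.read()  # fetch and parse the live robots.txt

url = "https://www.example.com/listing/12345/brochure"  # hypothetical blocked page
print(rp.can_fetch("Googlebot", url))  # False means robots.txt blocks it,
                                       # which is what Search Console reports
```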

One way we could fix this is to transition to using strictly the robots meta tag in the HTML output (or the X-Robots-Tag HTTP response header) and then remove the appropriate line from the robots.txt file. This allows Googlebot to crawl the page, see the noindex,nofollow directive, and then adjust its known page inventory.
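For illustration only (a generic sketch, not our actual template code), that alternative approach looks something like this: the page is allowed to be crawled, but it tells crawlers not to index it, either via a robots meta tag in the HTML or via the X-Robots-Tag response header.

```python
# Sketch: serve a page that is crawlable but marked noindex,nofollow,
# both in the HTML (<meta name="robots">) and in the X-Robots-Tag header.
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"""<!doctype html>
<html>
  <head>
    <meta name="robots" content="noindex, nofollow">
    <title>Brochure</title>
  </head>
  <body>Printable brochure content here.</body>
</html>"""

class NoIndexHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        # Header equivalent of the meta tag; also works for non-HTML files
        self.send_header("X-Robots-Tag", "noindex, nofollow")
        self.end_headers()
        self.wfile.write(PAGE)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), NoIndexHandler).serve_forever()
```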

I don’t recommend this though, since robots.txt is not just for Google; it’s also for bad actors (not that they always listen).

At the end of the day, Google saying they can’t index something is not always a bad thing. If they say they can’t index what you don’t want them to, they are just confirming your robots.txt file is working :slight_smile: