QUOTE (gumball @ Feb 11 2009, 01:01 PM)

1. We are on 7.4 and have a custom HTML page for our home page.
Entering http://www.EXAMPLE.com, https://www.EXAMPLE.com, or http://EXAMPLE.com on the address line serves up our home page, but according to the search engines 3 pages exist with the same content, which can get us banned for spam.
We spoke with the top "SEO guy" at NetSol and he was able to redirect https://www.EXAMPLE.com to http://www.EXAMPLE.com using a 200 redirect (we are waiting to hear back about http://EXAMPLE.com)... but with a 200 on both pages, the search engines want to index both the start page and the target page. This is also a known spam method.
Our closest competitor is on the Yahoo platform, and trying this for their site yields "Page not found" for all but http://www.EXAMPLE.com. We think this or a 301 would be a preferred method and would like to see a solution.
Unless NS can fix this, we suggest you avoid using the custom HTML for your home page (post date 02/10/09)
2. When you load an item and visit your shopping cart, assuming you have SSL, you are in secure mode (HTTPS). If you decide to go back to the main site to add another product, you transition from secure (HTTPS) to non-secure (HTTP). This transition uses a temporary 302 redirect and also tells the SE's that there are 2 pages with the same content (spam trigger). What is needed is a 301 "moved permanently" redirect.
We called NS support about this numerous times and were told that there was no fix for the Custom HTML page and that #2 is a non-issue because the SE's are prevented from crawling cart.aspx due to robots.txt.
We are not comfortable with these explanations and would like to see a fix. Posting this today to ensure that NS support/development teams are aware.
Hello gumball,
When you place pages on the site outside of the cart (e.g. an index.html page), you step outside the security and SEO blanket that the cart was designed to provide.
1.) The first issue you see with your homepage and the variations of https and www could actually extend beyond what you've listed. It can include the trailing /index.html as well, giving you 6 different ways to have your homepage indexed. This issue is commonly referred to as canonicalization. The cart by design eliminates 4 variations to minimize potential impacts. We allow for 2 variations to ensure our customers still have the flexibility they require. It is up to the end user to use each variation as appropriate.
To ease your concerns, the search engines won't treat this as "spam" but rather as a canonicalization issue. You won't get banned for a canonicalization issue of this type. It may diminish your ability to rank effectively for the right page, because it confuses the search engines about which version to rank, but it won't get you banned.
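To make the canonicalization problem concrete, here is a minimal Python sketch that collapses the six homepage variants into a single URL. The preferred form (http with www) and the constant and function names are assumptions chosen for illustration; this is not something the cart exposes.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical preferred form of the homepage; adjust to your own site.
CANONICAL_SCHEME = "http"
CANONICAL_HOST = "www.example.com"
BARE_HOST = "example.com"


def canonical_homepage(url: str) -> str:
    """Collapse scheme, www, and /index.html variants into one URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is already lowercased
    # Leave URLs on other sites untouched.
    if host not in (CANONICAL_HOST, BARE_HOST):
        return url
    path = parts.path or "/"
    if path.lower() == "/index.html":
        path = "/"
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST, path, parts.query, ""))


# The six ways the same homepage can be reached.
variants = [
    "http://www.EXAMPLE.com",
    "https://www.EXAMPLE.com",
    "http://EXAMPLE.com",
    "http://www.EXAMPLE.com/index.html",
    "https://www.EXAMPLE.com/index.html",
    "http://EXAMPLE.com/index.html",
]

# Every variant maps to the same canonical address.
canonical = {canonical_homepage(u) for u in variants}
```

A search engine that sees all six addresses has to guess which one is "the" homepage; the sketch simply shows that they are meant to be one page.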
If you choose to keep using the custom index.html page, you can create a Webmaster account at Google and specify which version of your homepage you want treated as the preferred one. How To from Google:
http://googlewebmastercentral.blogspot.com...red-domain.html.
The only really viable option you have to prevent http://example.com from occurring is to ensure that every link going to that page includes the www. However, you won't have control over anyone else linking to your website without the www.
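One way to check your own pages for links that omit the www is sketched below in Python. It is only an illustration: the host name, the class and function names, and the sample markup are all assumptions, and it looks only at the href of anchor tags.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

BARE_HOST = "example.com"  # hypothetical non-www host to flag


class BareHostLinkFinder(HTMLParser):
    """Collect <a href> values that point at the non-www host."""

    def __init__(self):
        super().__init__()
        self.bare_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Relative links have no hostname and are skipped.
            if name == "href" and value:
                if (urlsplit(value).hostname or "") == BARE_HOST:
                    self.bare_links.append(value)


def find_bare_links(html: str) -> list:
    finder = BareHostLinkFinder()
    finder.feed(html)
    return finder.bare_links


# Made-up sample markup for the demonstration.
sample = (
    '<a href="http://example.com/">Home</a> '
    '<a href="http://www.example.com/">Home</a> '
    '<a href="/about.html">About</a>'
)
flagged = find_bare_links(sample)
```

Running it over your own templates gives a list of links to correct; it cannot do anything about links on other people's sites.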
The preferred method to address all of your issues is to use the homepage included in the cart (index.aspx) and use the built-in CSS flexibility to hide the portions of the site you don't want showing up. By applying the "display: none;" declaration on the homepage you can effectively hide any portion of the homepage you don't want displayed, and then float anything you want there (your index.html content).
2.) This depends on how you have your cart configured. Once you add a product you can reload the page, reload the page with an alert, go to cart details, go to checkout, or go to the homepage. Depending on this setting you may or may not enter an https area of the site.
As you mentioned, we explicitly instruct search engine spiders, via the robots.txt file, not to crawl, index, or follow any pages that would take you into an https area. As a second level of instruction, we've also added a directive to every https page that instructs the spiders not to index the page.
While the cart may be using a 302 to transition a user from an https version to an http version of your site, search engine spiders have been instructed not to even get to that page, which prevents the issue.
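If you want to see how a well-behaved spider reads such a rule, the Python sketch below uses the standard library's robots.txt parser. The two-line rule set is an assumption made up for illustration (only cart.aspx is named in this thread); check your store's actual robots.txt for the real rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the cart's real robots.txt may list more paths.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart.aspx
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant spider asks before fetching each URL.
cart_allowed = parser.can_fetch("*", "https://www.example.com/cart.aspx")
home_allowed = parser.can_fetch("*", "http://www.example.com/index.aspx")
```

Here `cart_allowed` comes back false and `home_allowed` comes back true, so a spider that honors the file never requests the cart page and never sees the 302 that follows it.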
It sounds like much of your pain is coming from using the index.html page for your homepage, which prevents the cart from blocking all of the pages you don't want indexed. I'd strongly suggest using the index.aspx page with the "display: none" and JavaScript option for the containers you don't want displayed. It will reduce, if not eliminate, all the issues you listed.
Let me know if we can further assist.
-Aaron Eversgerd