To manage bandwidth usage on your site effectively, it is important to understand what contributes to it.
The primary traffic sources contributing to bandwidth usage on your website are:
- Traffic from spiders/bots for indexing
- Traffic from visitors
- Optionally, web service traffic from integrations via our Web Service API
- Optionally, large videos streamed from your website
Traffic from spiders/bots:
Your website is crawled by spiders and bots so that it can be indexed by search engines, comparison engines and other services. However, spider/bot bandwidth usage can add up quickly, leading to OnDemand Licensing Fees once your monthly reserve has been used. The key to managing spider traffic is to review which spiders/bots visit your site and to block/disallow those that are not beneficial to your business. Example: If you don't sell internationally, it may not be beneficial for your site to be crawled by international search engine bots. Some international crawlers can use very significant amounts of bandwidth, sometimes bordering on abuse.
To restrict such spiders from indexing your site and using your monthly reserve, you can use the Visitor Traffic & Analytic Rules feature in AmeriCommerce under Global Settings > Workflow Rule Engine > Visitor Traffic & Analytics.
Spiders can be of known or unknown origin.
Known spiders are spiders that AmeriCommerce has already identified and named. Identifying and naming spiders is an ongoing effort at AmeriCommerce, and every spider that we identify is named and added to your reports.
Below is an example of how to identify and block a known international spider.
Step 1: Select the Top Spiders/Crawlers Report and Identify International Spiders to block
Step 2: Set up a Rule under Visitor Traffic & Analytics to block the identified International Spiders
There are two parts to this step:
- Identify the User Agent
- Set up a rule for the User Agent
Browse to Tools > 3 Power Features > Rule Engine > Visitor Traffic & Analytics
To identify the User Agent name, find the admin rule for it:
- Use the arrows to page through the list until you find the spider you wish to block
- Click on the edit icon beside it
- Make a note of the User Agent. (This will be used in the rule for blocking.)
To set up the rule to block the spider:
Select User Rules instead of Admin Rules, then click the orange New button in the upper right.
Fill in the new rule:
- Enter a Rule Name (any name of your choice) and mark the rule as 'Active'
- Select User Agent as the Condition Type and enter the name of the spider (the User Agent noted above) as the Condition; click the '+' button to add the condition
- Select 'BlockUserRequest' as the Action; click the '+' button to add the action
- Click Save
Once this is set up, whenever the international spider visits your site, the condition is met and the Action of blocking the user request is triggered. In effect, the spider's visit is blocked/disallowed, which reduces unnecessary bandwidth usage. The spider is shown a standard blocked-request message that is customizable via the Store Text & Languages module in AmeriCommerce.
You can also redirect the spider to a page of your choice.
You can repeat the rule setup step for all the international spiders that you need to block.
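To make the condition/action behavior described above concrete, here is a minimal Python sketch of how a User Agent rule might be evaluated. It is not AmeriCommerce's implementation: the `Rule` class, the `evaluate` function, the "ExampleBot" name and the case-insensitive "contains" matching are all assumptions made for illustration. Only the 'BlockUserRequest' action name and the 'Active' flag come from the steps above.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical stand-in for a user rule with one User Agent condition."""
    name: str
    user_agent_contains: str           # assumed condition value, e.g. the spider name
    action: str = "BlockUserRequest"   # action name taken from the article
    active: bool = True


def evaluate(rules, user_agent):
    """Return the action of the first active matching rule, else 'Allow'.

    Matching is assumed to be a case-insensitive substring test.
    """
    ua = user_agent.lower()
    for rule in rules:
        if rule.active and rule.user_agent_contains.lower() in ua:
            return rule.action
    return "Allow"


rules = [Rule(name="Block ExampleBot", user_agent_contains="ExampleBot")]

# A request from the (hypothetical) spider meets the condition.
print(evaluate(rules, "Mozilla/5.0 (compatible; ExampleBot/2.1)"))    # BlockUserRequest
# An ordinary browser does not, so the visit is allowed.
print(evaluate(rules, "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"))  # Allow
```

An inactive rule is skipped entirely in this sketch, which mirrors the idea that only rules marked 'Active' take effect.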
While we continue to identify and name spiders, this is an ongoing process and there will always be unknown spiders indexing your site. The Visitor Analysis and Top Visits by User Agent reports provide detailed information on User Agents. You can review them to identify visits that are using up bandwidth and then use the Visitor Traffic and Analytics Rules feature to disallow them. Be wary of making rules too simple or blocking too much traffic: an improperly built rule could easily block beneficial traffic, or even every visitor to your site.
The search tool at http://www.botsvsbrowsers.com/ can help identify bots/crawlers by IP or User Agent.
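The review described above can be sketched in Python, assuming you have the report data available as (User Agent, bytes) rows. The sample rows, the function names and the "contains" matching are hypothetical and are not an AmeriCommerce export format. The sketch totals bandwidth per User Agent and then checks what share of visits a candidate rule would match, which is one way to catch an overly simple rule before saving it.

```python
from collections import defaultdict

# Hypothetical report rows: (user agent, bytes transferred)
visits = [
    ("Mozilla/5.0 (compatible; ExampleBot/2.1)", 52_000_000),
    ("Mozilla/5.0 (Windows NT 10.0) Chrome/120.0", 1_200_000),
    ("Mozilla/5.0 (compatible; ExampleBot/2.1)", 48_000_000),
    ("Mozilla/5.0 (Macintosh) Safari/605.1.15", 900_000),
]


def bandwidth_by_user_agent(rows):
    """Total bytes per user agent, largest consumer first."""
    totals = defaultdict(int)
    for user_agent, nbytes in rows:
        totals[user_agent] += nbytes
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def matched_share(rows, pattern):
    """Fraction of visits that a 'user agent contains pattern' rule would match."""
    hits = sum(1 for user_agent, _ in rows if pattern.lower() in user_agent.lower())
    return hits / len(rows)


print(bandwidth_by_user_agent(visits)[0])   # ('Mozilla/5.0 (compatible; ExampleBot/2.1)', 100000000)
print(matched_share(visits, "ExampleBot"))  # 0.5
print(matched_share(visits, "Mozilla"))     # 1.0
```

In this sample, a rule on "Mozilla" would match every visit, including real browsers, while a rule on "ExampleBot" matches only the spider.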
Traffic from Visitors to your site:
Traffic from visitors to your site is good for business and should not be blocked. Blocking visitors is helpful only in rare instances, such as when you are trying to block a fraudulent user or keep untargeted traffic away from your site. We do not recommend blocking visitor traffic unless you are certain that a specific visitor, or visitors from a specific link/referring domain, should be blocked.
In some cases, hackers may mimic standard user agent strings to appear like a normal browser. Stay wary of abnormally high visit numbers, and add IP Address or HostName to your report columns to see whether a single IP or host is abusing your site. You can block these specific IPs or hosts as well.
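As a rough illustration of spotting a single abusive IP behind a normal-looking user agent, the Python sketch below counts visits per IP address and flags those at or above a threshold. The rows, the `flag_heavy_ips` name and the threshold value are assumptions for illustration; in practice the data would come from your report with the IP Address column added, and a sensible threshold depends on your normal traffic.

```python
from collections import Counter

# Hypothetical report rows: (IP address, user agent)
browser_ua = "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"
rows = [("203.0.113.7", browser_ua)] * 5 + [
    ("198.51.100.4", browser_ua),
    ("198.51.100.9", browser_ua),
]


def flag_heavy_ips(report_rows, threshold):
    """Return {ip: visit count} for IPs with at least `threshold` visits."""
    counts = Counter(ip for ip, _ in report_rows)
    return {ip: n for ip, n in counts.items() if n >= threshold}


print(flag_heavy_ips(rows, threshold=3))  # {'203.0.113.7': 5}
```

All three IPs send the same browser-like user agent here, so only the per-IP count reveals the outlier.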