7.7 Federated Searches

The growing use of federated searches and the spread of web crawler robots have the potential to inflate usage statistics, so COUNTER requires report providers to identify this type of usage in COUNTER Reports.

Search activity generated by federated search engines MUST be categorized separately from searches conducted by users on the host platform.

Any searches generated from a federated search system MUST be included in the separate Searches_Federated counts within Database Reports and MUST NOT be included in the Searches_Regular or Searches_Automated counts.
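
As a rough illustration of the rule above, the following Python sketch tallies search events so that each one is counted under exactly one metric type. The event structure, the `source` field, and its labels are assumptions made for this example and are not defined by COUNTER; only the metric type names come from the text.

```python
from collections import Counter

# Hypothetical mapping from an event's source label to a COUNTER metric type.
# The labels on the left are illustrative, not part of the Code of Practice.
METRIC_BY_SOURCE = {
    "user": "Searches_Regular",
    "federated": "Searches_Federated",
    "automated": "Searches_Automated",
}


def tally_searches(events):
    """Count each search event under exactly one metric type."""
    # Start every metric at zero so all three appear in the result.
    counts = Counter({metric: 0 for metric in METRIC_BY_SOURCE.values()})
    for event in events:
        counts[METRIC_BY_SOURCE[event["source"]]] += 1
    return dict(counts)
```

Because each event maps to a single metric type, a federated search can never also be added to the Searches_Regular or Searches_Automated totals.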

The most common ways to recognize federated and automated search activity are as follows:

  • A federated search engine may be using its own dedicated IP address, which can be identified and used to separate out the activity.

  • If the standard HTML interface is being used (e.g. for screen scraping), the browser ID within the web log files can be used to identify the activity as coming from a federated search.

  • All searches via APIs and Z39.50 activity must be counted as Searches_Federated, as the results are not presented in the platform user interface.

  • If an API gateway is available, set up an instance of the gateway that is for the exclusive use of federated search tools. It is recommended that you also require the federated search engine to include an identifying parameter when making requests to the gateway.

  • For Z39.50 activity, authentication is usually through a username/password combination. Create a unique username/password that only the federated search engine will use.
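
The recognition methods listed above could be combined along the following lines. This is a minimal Python sketch under stated assumptions: the request fields (`channel`, `ip`, `user_agent`), the IP range, and the browser ID marker are all hypothetical placeholders, and a real platform would draw these values from its own logs and configuration.

```python
import ipaddress

# Hypothetical values for illustration only.
FEDERATED_NETWORKS = [ipaddress.ip_network("192.0.2.0/28")]  # dedicated IPs
FEDERATED_AGENT_MARKERS = ("examplefedsearch",)  # browser ID substrings


def is_federated_search(request):
    """Return True if a search request looks like federated search activity."""
    # Searches via APIs and Z39.50 are always counted as federated.
    if request.get("channel") in ("api", "z39.50"):
        return True

    # A dedicated IP address identifies a known federated search engine.
    ip = request.get("ip")
    if ip is not None:
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in FEDERATED_NETWORKS):
            return True

    # For the standard HTML interface, fall back to the browser ID.
    agent = request.get("user_agent", "").lower()
    return any(marker in agent for marker in FEDERATED_AGENT_MARKERS)
```

A request for which this function returns True would be counted under Searches_Federated rather than Searches_Regular.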

Where federated or automated usage is genuine user-driven usage, in the context of Text & Data Mining, Access_Method=TDM should be used. This allows users of the resultant reports to distinguish automated usage from more traditional (Access_Method=Regular) usage.

COUNTER has lists of federated search tools in Appendix F. These are separate from the list of robots, which is reviewed and updated on a regular basis and can be found at COUNTER Robots.