
When writing syndication or ad-tracking scripts, it is often important to know which clicks on your links come from search bots rather than actual users, so that you can separate real user interaction from bot traffic. Here is a script we wrote to do that.
First we compile an array of all the known search bots. (Although this is not comprehensive, it does cover the main web crawlers – and we will update it as we see the need.)
//********************* || LIST OF KNOWN SEARCH BOTS ||
function GetBotList(){
    // Substrings that identify known crawlers in a User-Agent header (duplicate "Scooter" entry removed).
    $BotList = array("Teoma", "alexa", "froogle", "Gigabot", "inktomi", "looksmart", "URL_Spider_SQL", "Firefly", "NationalDirectory", "Ask Jeeves", "TECNOSEEK", "InfoSeek", "WebFindBot", "girafabot", "crawler", "www.galaxy.com", "Googlebot", "Googlebot/2.1", "Google Webmaster", "Scooter", "Slurp", "msnbot", "appie", "FAST", "WebBug", "Spade", "ZyBorg", "rabaz", "Baiduspider", "Feedfetcher-Google", "TechnoratiSnoop", "Rankivabot", "Mediapartners-Google", "Sogou web spider", "WebAlta Crawler", "MJ12bot");
    return $BotList;
}
Next, we loop through the search bot array and match each entry against our $_SERVER['HTTP_USER_AGENT'] variable.
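//********************* || SEARCH BOT DETECTION FUNCTION ||
function DetectBot(){
    // Some clients send no User-Agent header at all.
    $UserAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    foreach(GetBotList() as $bot) {
        // ereg() was removed in PHP 7; strpos() does a literal, case-sensitive substring match.
        if(strpos($UserAgent, $bot) !== false) {
            return "BOT: " . $bot;
        }
    }
    // No known bot matched.
    return false;
}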
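Because the detection function reads $_SERVER directly, it is awkward to try out from the command line. As a minimal sketch, the same substring check can be written with the user agent and the bot list passed in as arguments. MatchBot(), the shortened bot list, and the sample user agent strings are hypothetical and only here for illustration.

```php
<?php
// Hypothetical helper (not part of the original script): the same
// substring check, but with the user agent and bot list passed in.
function MatchBot($UserAgent, array $BotList) {
    foreach ($BotList as $bot) {
        // Literal, case-sensitive substring match.
        if (strpos($UserAgent, $bot) !== false) {
            return "BOT: " . $bot;
        }
    }
    // No entry matched.
    return false;
}

// Shortened list and sample user agents, chosen for illustration only.
$SampleBots = array("Googlebot", "Slurp", "msnbot");
$GoogleUA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";
$FirefoxUA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0";

var_dump(MatchBot($GoogleUA, $SampleBots));   // string(14) "BOT: Googlebot"
var_dump(MatchBot($FirefoxUA, $SampleBots));  // bool(false)
```

Passing the inputs in keeps the matching logic separate from the request, which makes it easy to check new list entries against real user agent strings from your logs.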
Finally, we use the PHP get_browser() function to format the result and double-check it against $_SERVER['HTTP_USER_AGENT']. Note that get_browser() only works when the browscap directive in php.ini points to a browscap.ini file.
//********************* || USER AGENT DETECTION FUNCTION ||
function DetectBrowserInfo() {
    // Passing true makes get_browser() return an array instead of an object.
    $UserAgent = get_browser(null, true);
    $UserBrowser = isset($UserAgent['parent']) ? $UserAgent['parent'] : null;
    $UserOS = isset($UserAgent['platform']) ? $UserAgent['platform'] : "unknown";
    // Call DetectBot() once and reuse the result.
    $Bot = DetectBot();
    if($Bot != false) {
        $UserBrowser = $Bot;
    } elseif ($UserBrowser === null) {
        // Neither the bot list nor browscap recognised the visitor.
        $UserBrowser = "BOT: Unknown";
    }
    if($UserOS == "unknown") {
        $UserOS = "BOT: N/A For Traffic";
    }
    return array($UserBrowser, $UserOS);
}
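To show how the returned array might feed a tracking script, here is a rough sketch that tallies clicks as bot or user traffic based on the "BOT: " prefix. IsBotResult(), CountClick(), and the $Totals array are hypothetical names, and the sample inputs are hard-coded stand-ins for what DetectBrowserInfo() returns, so the snippet runs without browscap installed.

```php
<?php
// Hypothetical helper: true when the browser label starts with "BOT: ".
function IsBotResult(array $UserAgentDetails) {
    // DetectBrowserInfo() returns array($UserBrowser, $UserOS).
    return strpos($UserAgentDetails[0], "BOT: ") === 0;
}

// Hypothetical tally; a real tracker would write to a database or log.
function CountClick(array $Totals, array $UserAgentDetails) {
    $Key = IsBotResult($UserAgentDetails) ? 'bots' : 'users';
    $Totals[$Key]++;
    return $Totals;
}

$Totals = array('users' => 0, 'bots' => 0);
// Sample values shaped like DetectBrowserInfo() output, for illustration only.
$Totals = CountClick($Totals, array("BOT: Googlebot", "BOT: N/A For Traffic"));
$Totals = CountClick($Totals, array("Firefox", "Linux"));
print_r($Totals);
```

In a live script the second argument to CountClick() would simply be the result of DetectBrowserInfo().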
Reader comment: Sometimes a bad robot uses the same user agent as a search engine bot.
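As the comment points out, a user agent string is easy to fake, so a match against the bot list does not prove that the visitor is a real search engine. One common cross-check is a reverse DNS lookup on the visitor's IP followed by a forward lookup on the resulting host name. The sketch below assumes Googlebot host names end in googlebot.com or google.com, which should be confirmed against Google's current documentation. IsVerifiedGooglebot() and its injectable lookup parameters are hypothetical, and the stub lookups exist only so the example runs without network access.

```php
<?php
// Hypothetical helper: checks whether an IP that claims to be Googlebot
// resolves to a Google host name (reverse lookup, then forward lookup).
function IsVerifiedGooglebot($Ip, $ReverseLookup = 'gethostbyaddr', $ForwardLookup = 'gethostbyname') {
    $Host = call_user_func($ReverseLookup, $Ip);
    // gethostbyaddr() returns the unmodified IP (or false) when the lookup fails.
    if ($Host === false || $Host === $Ip) {
        return false;
    }
    // Assumption: genuine crawler host names end in googlebot.com or google.com.
    if (!preg_match('/\.(googlebot|google)\.com$/i', $Host)) {
        return false;
    }
    // The forward lookup must point back at the same IP.
    return call_user_func($ForwardLookup, $Host) === $Ip;
}

// Stub lookups so the example runs offline; omit them to use real DNS.
$FakeReverse = function ($Ip) { return "crawl-66-249-66-1.googlebot.com"; };
$FakeForward = function ($Host) { return "66.249.66.1"; };
$SpoofReverse = function ($Ip) { return "host.example.com"; };

var_dump(IsVerifiedGooglebot("66.249.66.1", $FakeReverse, $FakeForward));   // bool(true)
var_dump(IsVerifiedGooglebot("203.0.113.5", $SpoofReverse, $FakeForward));  // bool(false)
```

DNS lookups are slow, so a check like this is better run on logged traffic or cached per IP than on every request.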