By Tom Howe, Director of Field Engineering, Hydrolix
Most traffic on the internet today comes from bots, and AI-driven bot traffic has surged 300% in the past year. At Hydrolix, we saw one company hit with a six-figure overcharge from its internet service provider because of AI-driven bot traffic. This surge is creating an environment where automated abuse is a primary driver of fraud, traffic volatility, and customer friction.
Between 14% and 22% of paid search clicks are invalid, with bots driving nearly 40% of that fraudulent activity. Meanwhile, credential-based attacks fueled by infostealer malware and leaked identity data continue to rise at triple-digit rates year over year. This junk traffic costs companies a fortune, wastes resources, and opens the door to damaging cyberattacks.
Combating these bots requires a comprehensive bot strategy: one that includes classification, because not all bots are bad, and instant detection to shut down those that are. Multiple stakeholders are involved in this effort. Marketing teams worry about SEO, overall traffic, and now GEO (generative engine optimization). Revenue teams focus on monetization opportunities. Legal departments consider intellectual property rights. Product and UX teams want to optimize user experience and site performance. Security teams assess and mitigate risk. Meanwhile, ops and finance teams watch infrastructure costs. If 25% of all your traffic is “good” bots hoovering up content, that can mean major CDN and cloud provider bills.
For all of those stakeholders, strategic bot management means being able to answer specific questions about traffic, such as the ones below (a rough query sketch follows the list):
What percentage of our overall traffic comes from bots (both good and bad)?
How has that percentage changed over the past year?
What proportion of “benign” bot traffic is AI-related versus traditional search crawlers? (GEO vs SEO)
Are bots being served cached content efficiently, or are they creating unnecessary load on our origin servers?
As agentic AI becomes more common, how are agents interacting with our web content?
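Questions like these reduce to aggregations over request logs. As a minimal sketch, assuming CDN or access logs that already carry an is_bot flag and a cache_status field (both column names are hypothetical and will differ by provider and pipeline), the first and fourth questions might be answered roughly like this in Python:

```python
import pandas as pd

def bot_traffic_summary(logs: pd.DataFrame) -> dict:
    """Share of requests coming from bots, and how often bots are served from cache.

    Assumes two columns: 'is_bot' (bool) and 'cache_status' ('HIT'/'MISS');
    both names are illustrative and vary by CDN and log pipeline.
    """
    if len(logs) == 0:
        return {"bot_share_of_requests": 0.0, "bot_cache_hit_rate": 0.0}
    bot_requests = logs[logs["is_bot"]]
    bot_share = len(bot_requests) / len(logs)
    cache_hit_rate = (
        (bot_requests["cache_status"] == "HIT").mean() if len(bot_requests) else 0.0
    )
    # A low bot cache-hit rate suggests crawlers are reaching origin servers.
    return {
        "bot_share_of_requests": bot_share,
        "bot_cache_hit_rate": cache_hit_rate,
    }
```

Run over weekly or monthly slices of retained data, the same aggregation also answers how the bot share has shifted over the past year.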
The ability to segment benign bot traffic by category, such as search engines, social media preview bots, AI crawlers, and monitoring services, helps craft more nuanced policies. Companies might welcome some agents that provide business value while restricting others. They could share AI-optimized content with crawlers while restricting access to content intended for human users. Or they could maximize the value of licensing deals by allowing AI companies that have negotiated agreements while blocking the rest.
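As a minimal sketch of what that segmentation could look like, assuming the logs expose a user-agent string, a rule-based classifier along these lines is a common starting point. The substrings below match well-known crawlers, but the list is illustrative rather than exhaustive, and user-agent matching alone can be spoofed, so production systems typically corroborate it with IP verification and behavioral signals:

```python
# Illustrative user-agent substrings per benign-bot category; a real deployment
# would maintain a much larger, frequently updated directory.
BOT_CATEGORIES = {
    "search_crawler": ["Googlebot", "bingbot", "DuckDuckBot"],
    "ai_crawler": ["GPTBot", "ClaudeBot", "CCBot", "PerplexityBot"],
    "social_preview": ["facebookexternalhit", "Twitterbot", "Slackbot"],
    "monitoring": ["Pingdom", "UptimeRobot", "StatusCake"],
}

def classify_user_agent(user_agent: str) -> str:
    """Return a coarse bot category for a user-agent string, or 'human/other'."""
    ua = user_agent.lower()
    for category, signatures in BOT_CATEGORIES.items():
        if any(sig.lower() in ua for sig in signatures):
            return category
    return "human/other"
```

With a category attached to each request, policies can diverge: allow search crawlers and licensed AI crawlers, rate-limit or block the rest, and route AI-optimized content only where it is intended to go.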
Understanding and making fast decisions about bot traffic requires analyzing bot behavior over time. Traditional bot management tools limit visibility to 30 days, mainly because of the cost of retaining terabyte- to petabyte-scale data and their inability to query at that volume.
Without long-term retention, companies can’t answer vital questions: How is bot behavior changing over time? Are our blocking policies working? With such limited data, they can’t tell which bots drive value and which only extract it, can’t recognize seasonal patterns, can’t validate long-term policy success, and can’t track how bot behavior evolves in response to their actions.
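As a sketch of the kind of question long-term retention unlocks, assuming a DataFrame of requests with a 'timestamp' column and a 'bot_category' column (for example, produced by the classifier above; both names are illustrative), a month-over-month breakdown is straightforward:

```python
import pandas as pd

def monthly_bot_share(logs: pd.DataFrame) -> pd.DataFrame:
    """Monthly share of requests per bot category.

    Assumes a datetime column 'timestamp' and a string column 'bot_category'
    (with 'human/other' marking non-bot traffic).
    """
    counts = (
        logs.assign(month=logs["timestamp"].dt.to_period("M"))
            .groupby(["month", "bot_category"])
            .size()
            .unstack(fill_value=0)
    )
    # Normalize each month so categories are expressed as a share of total traffic.
    return counts.div(counts.sum(axis=1), axis=0)
```

A year or more of these monthly shares shows whether AI crawler traffic is actually growing, whether a blocking policy bent the curve after it shipped, and whether seasonal peaks are human or automated.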
So where do companies begin building a sustainable bot strategy, one that identifies, classifies, and combats AI-driven bots in real time while retaining that voluminous data affordably?
The first step in this journey isn’t blocking bots or negotiating licensing deals. It’s gaining comprehensive visibility into what legitimate bots are actually doing on their properties, not just today or over the last thirty days, but over the last year and beyond. That requires a data platform that can ingest, correlate, and analyze terabyte-scale data in real time to show which bots are threatening versus benign, what impact they are having, and where they originate.
Mitigating and blocking malicious bot activity is mission critical, especially as bot attacks grow in volume and sophistication. But understanding benign traffic is just as important. Its impact is more nuanced and ambiguous: sometimes it should be blocked, sometimes encouraged. Failing to grasp this seismic shift will lead to negative outcomes and missed opportunities, because it’s not as simple as allow or deny. The next generation of bots and agentic AI can help transform businesses and open new revenue streams.