Website owners are unhappy with aggressive crawling by bots from OpenAI and Anthropic.
Many have voiced concern, including game designers tired of seeing their online databases hammered. That reaction makes sense: when you spend years of time, money, and effort curating something you love, you are going to be upset when an AI bot comes along and tears it apart.
Aggressive crawling is slowing websites down, with some pages taking nearly three times as long to load. Sites are also hit by waves of 502 Bad Gateway errors; in one case, a bot reloaded a site's homepage nearly 200 times every second.
On investigation, site owners traced the flood of traffic to a single IP address owned by OpenAI. Big tech firms have unleashed fleets of crawlers that scour websites for any information that can fuel their models.
They are after high-quality training data, but in practice they take whatever they can get. It is a race to gather as much data as possible, and some studies suggest the supply of fresh training data could run out within the next few years if scraping continues at this pace.
The problem is that this is creating serious financial strain, because sites have to absorb higher cloud bills. Those with limited resources are struggling; hosting a constant stream of bots is a heavy burden.
Some website owners and game developers are now operating at a loss; that is how bad things have become. OpenAI and Anthropic are giants of the AI world, and it is hard for small sites to fight back.
Many website owners have come forward with the same complaint, noting that even redesigning a site has become harder. Traffic has jumped, and cloud computing costs have doubled compared with the past.
Much of the traffic is junk, with crawlers making repeated requests that end in 404 errors. For now, the only real remedy is a temporary one: robots.txt, the protocol that tells crawlers which parts of a site they may and may not visit.
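As a minimal sketch, a site that wants to opt out of AI training crawls can add rules like the following to its robots.txt file; GPTBot, ClaudeBot, and Google-Extended are the user-agent tokens that OpenAI, Anthropic, and Google have documented for AI crawling:

    # Block OpenAI's training crawler
    User-agent: GPTBot
    Disallow: /

    # Block Anthropic's crawler
    User-agent: ClaudeBot
    Disallow: /

    # Opt out of Google's AI training (Gemini)
    User-agent: Google-Extended
    Disallow: /

Compliance is voluntary, though: a crawler can simply ignore these rules, which is why robots.txt is at best a stopgap.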
Research also shows that robots.txt restrictions have skyrocketed. One study covering April 2023 to April 2024 found that 5% of all online data, and 25% of the highest-quality sources, had added restrictions against crawlers.
Another study reported that the figures jumped even higher for crawlers run by Google, Anthropic, and OpenAI. Many data owners ban crawling in their Terms of Service, yet have no matching robots.txt restrictions in place.
When AI bots swarm a website, its traffic metrics fall out of sync, causing problems for both the site and its online advertising. AI is clearly reshaping the web; the open question is who will bear the cost.
Image: DIW-Aigen
Read next:
• YouTube Tests New Feature That Removes Age Restrictions On Videos And Helps Restore Content Impacted By Community Guideline Violations
• New Research Shows Most of the Links on AI Google Overviews are for Informational Intent