Web infrastructure provider Cloudflare has unveiled AI Labyrinth, a new feature designed to combat unauthorized data scraping by AI bots.
The company says it will serve AI-generated decoy content to bots caught attempting to scrape data. The tool is meant to deter AI firms from crawling websites and collecting training data without consent.
The move could prove a setback for the large language models that power leading AI chatbots such as ChatGPT. Cloudflare is best known for providing infrastructure and security for websites, particularly protection against DDoS attacks and other malicious traffic.
Instead of simply blocking these bots, the new system lures them into a sprawling maze of realistic-looking but irrelevant content, wasting the crawler's time and computing resources. That marks a shift from the standard block-and-defend approach used by most website protection services. According to Cloudflare, outright blocking can backfire: it alerts a crawler's operators that they have been detected, prompting them to change tactics.
When Cloudflare detects unauthorized crawling, it links the bot into a series of AI-generated pages convincing enough to entice the crawler to keep traversing them. Only on closer inspection does it become apparent that the content is not the site's real material, by which point both time and resources have already been wasted.
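To make the maze idea concrete, here is a minimal sketch (not Cloudflare's actual implementation, and `decoy_page`/`crawl` are hypothetical names) of how decoy pages that link only to further decoy pages can exhaust a naive crawler's budget without it ever reaching real content:

```python
import hashlib

def decoy_page(path, n_links=3):
    """Return (text, links) for a hypothetical decoy page.

    Each page deterministically links to further decoy pages, so a
    crawler that follows links keeps descending into the maze.
    """
    seed = hashlib.sha256(path.encode()).hexdigest()
    text = f"Filler article {seed[:8]} (plausible but irrelevant content)"
    links = [f"{path}/{seed[i * 8:(i + 1) * 8]}" for i in range(n_links)]
    return text, links

def crawl(start, max_pages=10):
    """Naive breadth-first crawler: follows every link it finds."""
    frontier, visited = [start], []
    while frontier and len(visited) < max_pages:
        page = frontier.pop(0)
        visited.append(page)
        _, links = decoy_page(page)
        frontier.extend(links)
    return visited

pages = crawl("/maze")
print(len(pages))  # the crawler's page budget runs out before the maze does
```

The point of the sketch is that the maze costs the defender almost nothing to generate on demand, while the crawler pays for every page it fetches.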
The company explains that while the content served to these bots is irrelevant to the page being crawled, it is carefully assembled from real, scientifically supported facts; the stated goal is to avoid contributing to the spread of misinformation, though whether decoy content achieves that remains debatable. Cloudflare generates the material using Workers AI, its commercial platform for running AI workloads.
The trap pages, built by Cloudflare itself, are linked in a way that keeps them invisible and inaccessible to regular visitors, so ordinary users browsing the web won't stumble into them by accident.
Cloudflare describes AI Labyrinth as a next-generation honeypot. Classic honeypots rely on invisible links that human visitors never see but that bots parsing raw HTML will follow. The company notes, however, that modern bots have become expert at spotting such traps, which calls for more sophisticated deception.
The decoy links carry the proper meta directives to stop search engines from indexing them, while remaining attractive to data-scraping bots. Honeypot techniques are not new, but Cloudflare's effort joins a growing race among providers to counter aggressive AI web crawling.
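Cloudflare has not published the markup it uses, but the two mechanisms described above — a link hidden from human visitors and a meta directive that keeps search engines away — might look roughly like this (illustrative sketch only; the path is hypothetical):

```html
<head>
  <!-- Standard robots directive: tells search engines not to index
       the decoy page or follow its links -->
  <meta name="robots" content="noindex, nofollow">
</head>
<body>
  <!-- Hidden from human visitors, but present in the raw HTML
       that a scraper parses and follows -->
  <a href="/generated/decoy-page" style="display: none">Further reading</a>
</body>
```

Well-behaved search crawlers honor the `robots` directive, so only bots that ignore such signals end up in the maze.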
Image: DIW-Aigen