Sharing knowledge is good. But some companies do not give credit where credit is due, so people are creating ways to protect their content from leechers.
For this website I started looking into how to add something like that. I don’t want to block all bots. For example, I like that my website is indexed by the Internet Archive. Send them money if you can.
To tackle the AI problem, one of the organisers of my local PHP user group created a solution called VolkswAIgen.
It is all standards-compliant and modern. I don't usually write PHP code that way, because I mostly work with legacy stuff: the kind with no documentation, no tests, and a different coding style from every developer who touched it. Obviously with no budget and a deadline of yesterday.
Also, I mostly do frontend stuff these days.
So here is my quick and dirty solution. It is the first time I have used a cache pool in PHP 🙌 Change the ingredients to your own liking 😉
For WordPress you could try the DefAI plugin. It uses the same library.
Install dependencies
composer require league/flysystem-local matthiasmullie/scrapbook volkswaigen/volkswaigen
Add PHP code
// Add the Composer autoloader (if not already added).
require __DIR__ . '/../vendor/autoload.php';

// Wrap in an anonymous function to not pollute the global namespace.
(function () {
    // Initialise variables first: a file-based PSR-6 cache pool in ../cache.
    $adapter = new \League\Flysystem\Local\LocalFilesystemAdapter(__DIR__ . '/../cache', null, LOCK_EX);
    $filesystem = new \League\Flysystem\Filesystem($adapter);
    $cache = new \MatthiasMullie\Scrapbook\Adapters\Flysystem($filesystem);
    $cachePool = new \MatthiasMullie\Scrapbook\Psr6\Pool($cache);

    $volkswaigen = new \VolkswAIgen\VolkswAIgen\Main(
        new \VolkswAIgen\VolkswAIgen\ListFetcher($cachePool)
    );

    $userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
    $ipAddress = $_SERVER['REMOTE_ADDR'] ?? '';

    // Behind a proxy this header can hold a list ("client, proxy1, proxy2"); use the first entry.
    // Clients can set the header themselves, so only rely on it behind a proxy you control.
    if (!empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
        $ipAddress = trim(explode(',', $_SERVER['HTTP_X_FORWARDED_FOR'])[0]);
    }

    // Is it a bot?
    if ($volkswaigen->isAiBot($userAgent, $ipAddress)) {
        // It's AI! Feed them the good stuff.
        \http_response_code(418);
        echo '<h1>I’m a teapot</h1>';
        exit;
    }
})();
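
One thing to watch: the X-Forwarded-For header used above can be set by anyone, and behind a chain of proxies it may contain a comma-separated list of addresses. Below is a minimal sketch of a stricter lookup. The function name resolveClientIp and its $trustProxy flag are my own invention for illustration and not part of VolkswAIgen. It assumes you know whether your site sits behind a proxy you control.

```php
/**
 * Hypothetical helper (not part of VolkswAIgen): pick the client IP address.
 *
 * @param array $server     Usually $_SERVER.
 * @param bool  $trustProxy Set to true only when a proxy you control sets X-Forwarded-For.
 */
function resolveClientIp(array $server, bool $trustProxy = false): string
{
    $remoteAddr = $server['REMOTE_ADDR'] ?? '';

    // Without a trusted proxy the header is just user input, so ignore it.
    if (!$trustProxy || empty($server['HTTP_X_FORWARDED_FOR'])) {
        return $remoteAddr;
    }

    // The header may hold a list such as "client, proxy1, proxy2".
    foreach (explode(',', $server['HTTP_X_FORWARDED_FOR']) as $candidate) {
        $candidate = trim($candidate);

        // FILTER_VALIDATE_IP accepts IPv4 and IPv6 addresses and rejects anything else.
        if (filter_var($candidate, FILTER_VALIDATE_IP) !== false) {
            return $candidate;
        }
    }

    // Nothing usable in the header: fall back to the direct connection.
    return $remoteAddr;
}
```

In the snippet above you could then replace the three IP address checks with `$ipAddress = resolveClientIp($_SERVER, true);`, or leave out the second argument if there is no proxy in front of your site.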