1. Basic principles of extracting image src
2. Extracting image src with regular expressions
<?php
$html = <<<HTML
<html>
<head><title>Test Page</title></head>
<body>
<img src="https://example.com/image1.jpg" alt="Image 1">
<img src="image2.jpg" alt="Image 2">
</body>
</html>
HTML;
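// Match the src attribute of each <img> tag, in single or double quotes.
// The \s before src keeps attributes such as data-src from matching.
$pattern = '/<img\b[^>]*?\ssrc\s*=\s*(["\'])(.*?)\1/is';
preg_match_all($pattern, $html, $matches);
// $matches[1] holds the quote character, $matches[2] the src values.
foreach ($matches[2] as $src) {
    echo "Found image src: $src\n";
}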
?>
3. Extracting image src with DOM parsing
<?php
$html = <<<HTML
<html>
<head><title>Test Page</title></head>
<body>
<img src="https://example.com/image1.jpg" alt="Image 1">
<img src="image2.jpg" alt="Image 2">
</body>
</html>
HTML;
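$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();
// Walk every <img> element and read its src attribute.
$images = $dom->getElementsByTagName('img');
foreach ($images as $image) {
    $src = $image->getAttribute('src');
    echo "Found image src: $src\n";
}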
?>
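This code first loads the HTML string into a DOMDocument object. It then calls getElementsByTagName to get every <img> element and iterates over them, reading the value of each src attribute.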
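As a variation on the DOM approach, the lookup can be wrapped in a small function that uses DOMXPath to select only the <img> elements that actually carry a src attribute. This is a minimal sketch rather than part of the original example: the function name extract_image_srcs is hypothetical, and the code assumes PHP's dom extension is enabled.

```php
<?php
/**
 * Return the src values of all <img> elements found in an HTML string.
 * extract_image_srcs() is a hypothetical helper name used for illustration.
 */
function extract_image_srcs(string $html): array
{
    if (trim($html) === '') {
        return []; // loadHTML() rejects an empty string
    }

    // Collect parse warnings internally, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // //img[@src] selects only <img> elements that have a src attribute.
    $xpath = new DOMXPath($dom);
    $srcs = [];
    foreach ($xpath->query('//img[@src]') as $img) {
        $srcs[] = $img->getAttribute('src');
    }
    return $srcs;
}

$html = '<p><img src="https://example.com/image1.jpg"><img alt="no src"><img src=\'image2.jpg\'></p>';
print_r(extract_image_srcs($html));
```

In this sample the second <img> is skipped because it has no src, and the single-quoted attribute on the third is read without any special handling, since the parser deals with quoting itself.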
4. Practical example: automatically downloading images from a web page
<?php
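$html = file_get_contents('https://example.com');
$pattern = '/<img\b[^>]*?\ssrc\s*=\s*(["\'])(.*?)\1/is';
preg_match_all($pattern, (string) $html, $matches);
// Make sure the target directory exists before writing into it.
is_dir('uploads') || mkdir('uploads', 0755, true);
foreach ($matches[2] as $src) {
    // Take the file name from the URL path so that a query string is not part of it.
    $image_name = basename((string) parse_url($src, PHP_URL_PATH));
    // Only absolute http(s) URLs are fetched; relative src values would first
    // need to be resolved against the page URL.
    $data = preg_match('#^https?://#i', $src) ? @file_get_contents($src) : false;
    if ($image_name === '' || $data === false) {
        echo "Skipped $src\n";
        continue;
    }
    file_put_contents("uploads/$image_name", $data);
    echo "Downloaded $image_name\n";
}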
?>
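A src value taken from a page is often relative (for example image2.jpg), and it has to be combined with the page address before it can be downloaded. The following is a simplified sketch under stated assumptions: resolve_image_url is a hypothetical helper, it covers only the common cases (values that already have a scheme, protocol-relative, root-relative, and path-relative values), and it does not normalize ../ segments or honor a <base> tag the way a full RFC 3986 resolver would.

```php
<?php
/**
 * Turn an <img> src value into an absolute URL, relative to the page it came from.
 * resolve_image_url() is a hypothetical helper name used for illustration.
 */
function resolve_image_url(string $page_url, string $src): string
{
    // Already has a scheme (http:, https:, data:, ...): return it unchanged.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $src)) {
        return $src;
    }

    $parts = parse_url($page_url) ?: [];
    $scheme = $parts['scheme'] ?? 'https';
    $host = $parts['host'] ?? '';
    $port = isset($parts['port']) ? ':' . $parts['port'] : '';
    $origin = "$scheme://$host$port";

    // Protocol-relative: //cdn.example.com/a.png
    if (strpos($src, '//') === 0) {
        return "$scheme:$src";
    }
    // Root-relative: /img/a.png
    if (strpos($src, '/') === 0) {
        return $origin . $src;
    }
    // Path-relative: resolve against the directory part of the page path.
    $path = $parts['path'] ?? '/';
    if (strpos($path, '/') !== 0) {
        $path = '/';
    }
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $src;
}

echo resolve_image_url('https://example.com/blog/post.html', 'image2.jpg'), "\n";
echo resolve_image_url('https://example.com/blog/post.html', '/img/a.png'), "\n";
echo resolve_image_url('https://example.com', '//cdn.example.com/a.png'), "\n";
```

Calling a helper like this on each extracted src before the download step would let relative images be fetched as well. For pages that use ../ paths or a <base> tag, a dedicated URL library is likely the safer choice.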