Python 3.7: Scraping Images from a Web Page

Published: 2019-07-19 09:56:49 | Editor: auto | Views: 2590

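#!/usr/bin/python

import re
import urllib.request  # In Python 3, urlopen and urlretrieve both live in urllib.request, so import it


def htmlGet(url):
    # Fetch the page and return its raw bytes
    page = urllib.request.urlopen(url)
    html = page.read()
    return html


def imgGet(html):
    # Capture https URLs ending in .jpg from src attributes (the dot is escaped to match a literal ".")
    res = r'src="(https.*?\.jpg)"'
    imgre = re.compile(res)
    # page.read() returns bytes in Python 3; without .decode(...) re.findall raises a TypeError,
    # so the encoding is given explicitly here
    imglist = re.findall(imgre, html.decode("utf-8"))
    x = 0
    for i in imglist:
        # Save each image in the current directory as 0.jpg, 1.jpg, ...
        urllib.request.urlretrieve(i, "%s.jpg" % x)
        x += 1


html = htmlGet("http://***")
imgGet(html)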
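
The extraction step in imgGet can be tried out without touching the network. The following is only a minimal sketch under stated assumptions: the helper names extract_jpg_urls and numbered_filenames, the constant IMG_PATTERN, and the sample HTML string are all hypothetical and do not appear in the original script. It also uses a slightly stricter pattern than imgGet: `[^"]*?` replaces `.*?` so that a match cannot run past the closing quote of one src attribute into the next tag.

```python
import re

# Stricter variant of the imgGet pattern (an assumption, not the original):
# the dot before "jpg" is escaped, and [^"] keeps a match inside one attribute.
IMG_PATTERN = re.compile(r'src="(https[^"]*?\.jpg)"')


def extract_jpg_urls(html_text):
    """Return all https .jpg URLs found in src attributes (hypothetical helper)."""
    return IMG_PATTERN.findall(html_text)


def numbered_filenames(urls):
    """Pair each URL with the file name imgGet would use: 0.jpg, 1.jpg, ..."""
    return [(url, "%s.jpg" % index) for index, url in enumerate(urls)]


# Made-up sample page, for illustration only.
sample_html = (
    '<img src="https://example.com/a.jpg">'
    '<img src="https://example.com/b.png">'
    '<img src="http://example.com/c.jpg">'
    '<img src="https://example.com/d.jpg">'
)

urls = extract_jpg_urls(sample_html)
pairs = numbered_filenames(urls)
print(pairs)
```

With this sample, only the two https .jpg addresses are kept; the .png and the plain http entry are skipped. The resulting pairs could then be passed one by one to urllib.request.urlretrieve, as the loop in imgGet does.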
