Python BeautifulSoup

Published: 2019-08-26 07:19:56 | Editor: auto | Views: 1822


    Use the get_text method of the BeautifulSoup library to extract the body text of a web page:

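    #!/usr/bin/env python3
    # coding=utf-8
    
    # Extract the body text from an HTML page
    
    import requests
    from bs4 import BeautifulSoup
    
    url = 'http://www.baidu.com'
    # Fetch the page; response.text is the decoded HTML source
    response = requests.get(url)
    
    # Name the parser explicitly so bs4 does not have to guess one
    soup = BeautifulSoup(response.text, 'html.parser')
    # get_text() joins every text node in the document into one string
    print(soup.get_text())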
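
Depending on the bs4 version, the output of get_text can include the contents of script and style elements, and it usually carries a lot of stray whitespace. The following is a minimal sketch of one way to tidy that up, not part of the original post: the helper name extract_text and the inline sample HTML are made up for illustration, and the snippet works on a string so that it runs without network access.

```python
from bs4 import BeautifulSoup


def extract_text(html):
    """Return the visible text of an HTML string, one text node per line.

    Hypothetical helper for illustration; not a bs4 API.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Remove <script> and <style> elements so their contents
    # cannot end up in the extracted text.
    for tag in soup(["script", "style"]):
        tag.decompose()
    # separator puts each text node on its own line;
    # strip=True trims whitespace and skips empty nodes.
    return soup.get_text(separator="\n", strip=True)


# Made-up sample page used only for this demonstration.
sample = (
    "<html><head><title>Demo</title><style>p{color:red}</style></head>"
    "<body><h1>Hello</h1><p>World <b>again</b></p>"
    "<script>var x = 1;</script></body></html>"
)
print(extract_text(sample))
```

To apply this to the page fetched above, pass the HTML string returned by requests to extract_text in place of sample.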

     
