[Python] Computing the average score of the Douban Movie Top 250

A crawler exercise written in Python. It felt a bit easier to write than in Golang.

    import re
    import urllib.request

    ORIGIN_URL = 'https://movie.douban.com/top250?start={}&filter='
    # The dot is escaped so that it only matches a literal decimal point.
    SCORE_RE = re.compile(r'property="v:average">([0-9]+\.[0-9])</span>')


    def get_urls():
        # 10 pages of 25 movies each: start = 0, 25, ..., 225
        return [ORIGIN_URL.format(step) for step in range(0, 250, 25)]


    def get_html(url):
        # Assumption: the site may reject requests without a browser-like User-Agent.
        req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
        with urllib.request.urlopen(req) as page:
            return page.read().decode('utf-8')


    def get_scores(html):
        # Each score sits in a span carrying property="v:average".
        return SCORE_RE.findall(html)


    def solve():
        scores = []
        for url in get_urls():
            print(url)
            scores.extend(float(s) for s in get_scores(get_html(url)))
        scores = scores[:250]
        if not scores:
            raise ValueError('no scores found; the page layout may have changed')
        # Average over the scores actually collected.
        return sum(scores) / len(scores)


    print(solve())
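
The pagination, extraction, and averaging steps can be tried without touching the network. The following is a minimal sketch under stated assumptions: `SAMPLE_HTML` is a made-up fragment (its markup and scores are illustrative, not scraped data), and the helper names `page_urls` and `average_score` are hypothetical, introduced only for this example.

```python
import re

# Hypothetical stand-in for part of a Top 250 page. The scores are invented
# for illustration and are not real Douban data.
SAMPLE_HTML = '''
<span class="rating_num" property="v:average">9.7</span>
<span class="rating_num" property="v:average">9.5</span>
<span class="rating_num" property="v:average">9.0</span>
'''

# Same idea as the pattern in the script, with the decimal point escaped.
SCORE_RE = re.compile(r'property="v:average">([0-9]+\.[0-9])</span>')


def page_urls(total=250, page_size=25):
    """Build one URL per result page: start=0, 25, ..., 225."""
    base = 'https://movie.douban.com/top250?start={}&filter='
    return [base.format(start) for start in range(0, total, page_size)]


def average_score(html):
    """Return the mean of all scores found in html, or None if none match."""
    scores = [float(s) for s in SCORE_RE.findall(html)]
    if not scores:
        return None
    return sum(scores) / len(scores)


urls = page_urls()
avg = average_score(SAMPLE_HTML)
print(len(urls), urls[-1])
print(round(avg, 2))
```

Returning `None` for a page with no matches keeps a layout change or a blocked request from silently turning into a division by zero.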

