Web crawler - Why does a Python simulated login to appannie keep returning a 503 code?
Problem description
# -*- encoding: utf-8 -*-
import requests, xlwt, sys
from bs4 import BeautifulSoup

reload(sys)
referer = 'https://www.appannie.com/account/login/?_ref=header'
user_agent = ('Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36')
sys.setdefaultencoding('utf-8')
header = {
    'User-Agent': user_agent,
    'Referer': referer,
    'Host': 'www.appannie.com',
    'Connection': 'keep-alive',
    'Accept': 'application/json, text/plain,*/*',
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'zh-CN,zh;q=0.8',
    'X-NewRelic-ID': 'VwcPUFJXGwEBUlJSDgc=',
    'X-Requested-With': 'XMLHttpRequest',
}


def main():
    url = 'https://www.appannie.com/account/login/'
    # content = requests.get(url,headers = header).content
    # soup = BeautifulSoup(content,'lxml')
    # key = soup.select()
    s = requests.Session()
    s.get(url, headers=header)
    key = s.cookies['csrftoken']
    data = {
        'csrfmiddlewaretoken': key,
        'next': '/dashboard/home/',
        'username': '1195615991@qq.com',
        'password': 'xxxxx',
    }
    req = s.post(url, data=data)
    if 2 != req.status_code / 100:
        raise Exception('Error while logging in, code: %d' % (req.status_code))
    cookies = req.cookies
    n = '2017-04-11'
    url_1 = 'https://www.appannie.com/apps/google-play/top-chart/?country=US&category=game&device=&date={}'.format(n)
    req_1 = s.get(url_1, headers=header, cookies=cookies).content
    # print req_1
    soup = BeautifulSoup(req_1, 'lxml')
    print soup
    # ids = soup.find_all('span')
    # for id in ids :
    #     name = id.get('title')
    #     print name


if __name__ == '__main__':
    main()
Answers
Answer 1: There are two key points:
1. the user-agent in the headers
2. the csrfmiddlewaretoken parameter
# coding: utf-8
import requests

url = 'https://www.appannie.com/account/login'
session = requests.Session()
# Set the user-agent on the session so that every request sends it.
session.headers['user-agent'] = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36'
# Load the login page first so the session receives the csrftoken cookie.
session.get(url)
token = session.cookies.get('csrftoken')
data = {
    'csrfmiddlewaretoken': token,
    'next': '/dashboard/home/',
    'username': 'XXXX',
    'password': 'XXXX',
}
r = session.post(url, data)
print(r.status_code)
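The first point can be checked without touching the network. In the question's code, `s.post(url, data=data)` is called without `headers=header`, so that request goes out with the default `python-requests/...` user-agent, whereas a header set on the session is merged into every request. The sketch below (written for Python 3) prepares the login POST and inspects it instead of sending it. The helper names `make_session`, `build_login_form` and `prepare_login` are hypothetical, introduced only for illustration, and the token is a dummy value, not a real one.

```python
import requests

LOGIN_URL = "https://www.appannie.com/account/login/"
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"
)


def make_session(user_agent=USER_AGENT):
    """Hypothetical helper: a session whose user-agent applies to every request."""
    session = requests.Session()
    session.headers["User-Agent"] = user_agent
    return session


def build_login_form(csrf_token, username, password):
    """Hypothetical helper: the form fields used in the answer above."""
    return {
        "csrfmiddlewaretoken": csrf_token,
        "next": "/dashboard/home/",
        "username": username,
        "password": password,
    }


def prepare_login(session, csrf_token, username, password):
    """Build the login POST without sending it, so it can be inspected."""
    form = build_login_form(csrf_token, username, password)
    request = requests.Request("POST", LOGIN_URL, data=form)
    # prepare_request merges session-level headers and cookies into the request.
    return session.prepare_request(request)


if __name__ == "__main__":
    # Session with a browser-like user-agent, as in the answer.
    browser_like = prepare_login(make_session(), "dummy-token", "XXXX", "XXXX")
    print(browser_like.headers["User-Agent"])

    # Plain session, as in the question's POST: the library default is sent.
    default = prepare_login(requests.Session(), "dummy-token", "XXXX", "XXXX")
    print(default.headers["User-Agent"])

    # The encoded form body carries the csrfmiddlewaretoken field.
    print(browser_like.body)
```

Printing both prepared requests side by side shows which user-agent the server would actually see. Whether that alone explains the 503 depends on the site's checks, so treat this as a debugging aid rather than a guaranteed fix.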
