URL parameters and ignored parameters in requests GET requests
Lao Dong (我爱我家 real estate SEO) · 2019-07-18
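Many websites use URLs that carry parameters, for example http://www.xxx.com/get?key1=val1&key2=val2. If you search for www.python66.com on Baidu, the URL of the results page is a very long string, but the page can also be reached with only some of those parameters, e.g. https://www.baidu.com/s?tn=50000021_hao_pg&word=python66.com. How do you request this kind of parameterized URL with requests?
1. How to pass URL parameters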
From the official documentation: You often want to send some sort of data in the URL’s query string. If you were constructing the URL by hand, this data would be given as key/value pairs in the URL after a question mark, e.g. httpbin.org/get?key=val. Requests allows you to provide these arguments as a dictionary of strings, using the params keyword argument. As an example, if you wanted to pass key1=value1 and key2=value2 to httpbin.org/get, you would use the following code:
# -*- coding: utf-8 -*-
import requests

# Query-string parameters as a dictionary of strings
payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.get("http://httpbin.org/get", params=payload)
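Similarly, the Baidu results page for python66.com can be requested like this:
# -*- coding: utf-8 -*-
import requests

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.44',
}
# tn and word are the two parameters from the search URL in the introduction
payload = {'tn': '50000021_hao_pg', 'word': 'python66.com'}
# The search results page lives under the /s path
r = requests.get("https://www.baidu.com/s", params=payload, headers=headers)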
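To make the mapping from dictionary to query string concrete without sending a request, here is a minimal sketch that uses only the standard library. The helper name build_url is hypothetical and not part of requests; requests does its own encoding internally, so treat this as an approximation of the idea rather than its actual implementation.

```python
from urllib.parse import urlencode


def build_url(base_url, params):
    """Append params to base_url as a query string (hypothetical helper)."""
    query = urlencode(params)
    # Only add the question mark when there is something to append
    return f"{base_url}?{query}" if query else base_url


payload = {'key1': 'value1', 'key2': 'value2'}
url = build_url("http://httpbin.org/get", payload)
print(url)  # http://httpbin.org/get?key1=value1&key2=value2
```

The key/value pairs are joined with & and placed after the question mark, which is the same shape as the hand-built URL described in the documentation excerpt.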
2. What URL parameter passing really does
# -*- coding: utf-8 -*-
import requests

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.44',
}
payload = {'tn': '50000021_hao_pg', 'word': 'python66.com'}
r = requests.get("https://www.baidu.com/s", params=payload, headers=headers)
# r.url is the URL that was actually requested
print(r.url)
https://www.baidu.com/s?tn=50000021_hao_pg&word=python66.com
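As the output shows, passing parameters through params is no different from requesting the fully assembled URL directly; requests simply builds the query string internally. The Baidu results-page example above can therefore be written as follows: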
# -*- coding: utf-8 -*-
import requests

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.44',
}
# The parameters are already part of the URL, so no params dictionary is needed
r = requests.get("https://www.baidu.com/s?tn=50000021_hao_pg&word=python66.com", headers=headers)
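The equivalence also holds in the other direction: a URL that already carries a query string can be split back into a base URL and a parameter dictionary. The following is a small sketch using the standard library's urllib.parse (not requests itself); the variable names are chosen only for illustration.

```python
from urllib.parse import urlsplit, parse_qsl

url = "https://www.baidu.com/s?tn=50000021_hao_pg&word=python66.com"

parts = urlsplit(url)
# Everything before the question mark
base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
# Everything after the question mark, as key/value pairs
payload = dict(parse_qsl(parts.query))

print(base_url)  # https://www.baidu.com/s
print(payload)   # {'tn': '50000021_hao_pg', 'word': 'python66.com'}
```

The recovered base_url and payload are the values one would pass to requests.get as the URL and the params argument.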
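3. Parameters that have no effect
From the official documentation: Note that any dictionary key whose value is None will not be added to the URL’s query string.
In other words, setting a key's value to None has the same effect as leaving that parameter out of the URL altogether.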
# -*- coding: utf-8 -*-
import requests

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.44',
}
# tn is None, so it is left out of the query string
payload = {'word': 'python66.com', 'tn': None}
r = requests.get("https://www.baidu.com/s", params=payload, headers=headers)
print(r.url)
https://www.baidu.com/s?word=python66.com
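The effect of a None value can be imitated with a plain dictionary filter. This is only a sketch of the behavior described above, not the code requests uses internally; drop_none is a hypothetical helper, and the explicit filter is needed here because the standard library's urlencode would otherwise encode None as the text "None".

```python
from urllib.parse import urlencode


def drop_none(params):
    """Return a copy of params without the keys whose value is None (hypothetical helper)."""
    return {key: value for key, value in params.items() if value is not None}


payload = {'word': 'python66.com', 'tn': None}
query = urlencode(drop_none(payload))
print(query)  # word=python66.com
```

This can be handy when parameters are optional: set the unused ones to None and pass the whole dictionary, instead of building a different dictionary for each case.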