API Starter Project: Collecting Information on Popular GitHub Projects


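An API is the part of a website designed for programs to interact with. When we want some of the information a site holds, we can call its API to request the data and then visualize it. Because every call fetches current data, the visualization always stays up to date, which makes the data far more useful.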

In this project we call the GitHub API to fetch information about the most-starred Python projects, then use Plotly to build an interactive chart.

import requests  # import the requests module

# Make an API call and store the response.
url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
# API versions can differ, so explicitly ask for version 3 of the API.
headers = {'Accept': 'application/vnd.github.v3+json'}
r = requests.get(url, headers=headers)  # send the request to the API
print(f"Status code: {r.status_code}")

# The API returns JSON; convert it to a dictionary and store it in response_dict.
response_dict = r.json()
# Print the keys to see what came back.
print(response_dict.keys())

https://api.github.com/search/repositories
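About this address: the leading https://api.github.com/ sends the request to the part of GitHub that responds to API calls. Next, search/repositories tells the API to run a search across all repositories. In the query string that follows the question mark, q stands for query and the equals sign begins its value. language:python restricts the results to repositories whose primary language is Python, and the final &sort=stars sorts the projects by star count.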
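To make the anatomy of this URL concrete, the short sketch below takes it apart with Python's standard urllib.parse module. It is only an illustration and sends no request; the variable names parts and query are chosen here and are not part of the original script.

```python
from urllib.parse import urlsplit, parse_qs

url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'

parts = urlsplit(url)          # split into scheme, host, path, query, fragment
query = parse_qs(parts.query)  # turn the query string into a dict of lists

print(parts.scheme)  # https
print(parts.netloc)  # api.github.com
print(parts.path)    # /search/repositories
print(query)         # {'q': ['language:python'], 'sort': ['stars']}
```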

The output looks like this: the status code is 200, and the response dictionary has only three keys: 'total_count', 'incomplete_results', and 'items'.

Status code: 200
dict_keys(['total_count', 'incomplete_results', 'items'])
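A status code of 200 means the call succeeded, and incomplete_results tells us whether GitHub managed to finish the whole search before its time limit. A minimal sketch of checking both before using the data might look like this. The helper name check_search_response and the sample dictionary are hypothetical; the function works on values we would already have from r.status_code and r.json(), so it runs without a network call.

```python
def check_search_response(status_code, response_dict):
    """Return the list of repositories, or raise if the call failed.

    Hypothetical helper: status_code stands for r.status_code and
    response_dict for r.json() from the request shown above.
    """
    if status_code != 200:
        raise RuntimeError(f"API call failed with status code {status_code}")
    if response_dict['incomplete_results']:
        # GitHub sets this flag when the search ran out of time
        # and only part of the matches were collected.
        print("Warning: results are incomplete.")
    return response_dict['items']


# Made-up sample data with the same three keys as the real response.
sample = {
    'total_count': 2,
    'incomplete_results': False,
    'items': [{'name': 'repo-a'}, {'name': 'repo-b'}],
}
items = check_search_response(200, sample)
print(len(items))  # 2
```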

import requests
url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
headers = {
     'Accept': 'application/vnd.github.v3+json'}
r = requests.get(url, headers=headers)
print(f"Status code: {r.status_code}")
response_dict= r.json()

# total_count is the number of repositories on GitHub that match the query.
print(f"Total repositories: {response_dict['total_count']}")

# Explore information about the repositories.
repo_dicts = response_dict['items']  # a list of dictionaries, one per repository
print(f"Repositories returned: {len(repo_dicts)}")

# Examine the first repository.
repo_dict = repo_dicts[0]
print(f"\nKeys:{len(repo_dict)}")
for key in sorted(repo_dict.keys()):
    print(key)


Here is the printed output, which shows what the data actually looks like:
Status code: 200
Total repositories: 6622696
Repositories returned: 30

Keys:74
archive_url
archived
assignees_url
(too many to list; some keys omitted)
watchers
watchers_count
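The output shows that only 30 of the matching repositories come back in a single response. GitHub's search API accepts per_page (up to 100) and page query parameters to control this. The sketch below builds such URLs with the standard library; the function name build_search_url and its defaults are assumptions made for illustration, and no request is sent.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.github.com/search/repositories'


def build_search_url(language, per_page=30, page=1):
    """Build a search URL for the most-starred repositories in a language."""
    params = {
        'q': f'language:{language}',
        'sort': 'stars',
        'per_page': per_page,  # results per response; GitHub caps this at 100
        'page': page,          # 1-based page number
    }
    # safe=':' keeps the colon in language:python readable instead of %3A.
    return f"{BASE_URL}?{urlencode(params, safe=':')}"


print(build_search_url('python', per_page=100, page=2))
```

The resulting string can be passed to requests.get() in place of the url variable used in the script.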

Now let's pull out the values associated with some of the keys in repo_dict.

import requests
url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
headers = {
     'Accept': 'application/vnd.github.v3+json'}
r = requests.get(url, headers=headers)
print(f"Status code: {r.status_code}")
response_dict= r.json()
print(f"Total repositories: {response_dict['total_count']}")
repo_dicts = response_dict['items']
print(f"Repositories returned: {len(repo_dicts)}")

repo_dict = repo_dicts[0]
# Pull out the values associated with some of the keys in repo_dict.
print("\nSelected information about each repository:")

print(f"Name: {repo_dict['name']}")  # name of the repository
print(f"Owner: {repo_dict['owner']['login']}")  # login name of the owner
print(f"Stars: {repo_dict['stargazers_count']}")  # number of stars earned
print(f"Repository: {repo_dict['html_url']}")  # URL of the repository on GitHub
print(f"Created: {repo_dict['created_at']}")  # when the project was created
print(f"Updated: {repo_dict['updated_at']}")  # when it was last updated
print(f"Description: {repo_dict['description']}")

The result looks like this:
Status code: 200
Total repositories: 6618376
Repositories returned: 30

Selected information about each repository:
Name: system-design-primer
Owner: donnemartin
Stars: 119890
Repository: https://github.com/donnemartin/system-design-primer
Created: 2017-02-26T16:15:28Z
Updated: 2021-01-31T02:19:49Z
Description: Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

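The created_at and updated_at values are ISO 8601 strings in UTC (that is what the trailing Z means). If we want to compute with them rather than just print them, they can be parsed into datetime objects. This is a small sketch using the two timestamps from the output above; the helper name parse_github_time is hypothetical.

```python
from datetime import datetime, timezone


def parse_github_time(text):
    """Parse a timestamp such as '2017-02-26T16:15:28Z' into an aware datetime."""
    parsed = datetime.strptime(text, '%Y-%m-%dT%H:%M:%SZ')
    return parsed.replace(tzinfo=timezone.utc)  # the Z suffix means UTC


created = parse_github_time('2017-02-26T16:15:28Z')
updated = parse_github_time('2021-01-31T02:19:49Z')

print(created.year)              # 2017
print((updated - created).days)  # 1434
```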

Now that we understand the data, we can put the whole thing together.

import requests

# The chart is defined with plain dictionaries, so only the offline module is needed.
from plotly import offline  # writes the chart to a local HTML file

# Make an API call and store the response.
url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
headers = {
     'Accept': 'application/vnd.github.v3+json'}
r = requests.get(url, headers=headers)
print(f"Status code: {r.status_code}")

# Process the results.
response_dict = r.json()
repo_dicts = response_dict['items']
# Three empty lists to hold the data for the chart.
repo_links, stars, labels = [], [], []
# Loop over every repository dictionary and collect its name, URL,
# star count, owner, and description.
for repo_dict in repo_dicts:
    repo_name = repo_dict['name']
    repo_url = repo_dict['html_url']
    # Make each x-axis label a clickable link to the repository.
    repo_link = f"<a href='{repo_url}'>{repo_name}</a>"
    repo_links.append(repo_link)

    stars.append(repo_dict['stargazers_count'])

    owner = repo_dict['owner']['login']
    description = repo_dict['description']
    # Hover text: owner and description on separate lines.
    label = f"{owner}<br />{description}"
    labels.append(label)

# Make the visualization: define the data list.
data = [{
    'type': 'bar',
    'x': repo_links,
    'y': stars,
    'hovertext': labels,
    'marker': {
        'color': 'rgb(60, 100, 150)',
        'line': {'width': 1.5, 'color': 'rgb(25, 25, 25)'},
    },
    'opacity': 0.6,
}]

# Define the chart layout with a dictionary.
my_layout = {
    'title': {
        'text': 'Most-Starred Python Projects on GitHub',
        'font': {'size': 28},
    },
    'xaxis': {
        'title': {'text': 'Repository', 'font': {'size': 24}},
        'tickfont': {'size': 14},
    },
    'yaxis': {
        'title': {'text': 'Stars', 'font': {'size': 24}},
        'tickfont': {'size': 14},
    },
}

fig = {'data': data, 'layout': my_layout}
offline.plot(fig, filename='python_repos.html')
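One detail worth guarding against when building the hover labels: a repository's description can be null in the API response, which arrives in Python as None and would show up as the literal text "None" in the label. The sketch below shows one way to handle that; the helper name make_label and the fallback text are assumptions, not part of the original script.

```python
def make_label(owner, description):
    """Build hover text from an owner and a possibly missing description."""
    if not description:  # covers None and the empty string
        description = 'No description provided'
    # <br /> puts the owner and the description on separate lines.
    return f"{owner}<br />{description}"


print(make_label('donnemartin', 'Learn how to design large-scale systems.'))
print(make_label('someone', None))
```

Inside the loop, the label line would then become label = make_label(owner, description).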

Finally, the script generates an HTML file containing the chart, which we can open in a browser.

[Image 1 from the original post]
