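As the COVID-19 pandemic continues to rage around the world, access to real-time data on confirmed cases and deaths is essential to epidemic prevention and control.
When media outlets and government health departments in many countries publish updates on the outbreak, they cite the COVID-19 data map from Johns Hopkins University in the United States. Besides associate professor Lauren Gardner, its creators and maintainers include two doctoral students from China, Dong Ensheng and Du Hongru. Both are first-year PhD students in the university's Department of Civil and Systems Engineering.
Image source: CCTV News app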
As COVID-19 spreads across the globe, daily reporting of new and total confirmed cases has become an important routine for news agencies worldwide. Most mainstream media outlets now cite data from Johns Hopkins University's online dashboard when they report on the outbreak.
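Dong Ensheng (right) and Du Hongru at an academic conference in the United States in January this year. Image source: Twitter
According to CCTV News, the data map was initiated and is maintained by Chinese doctoral student Dong Ensheng and his colleagues. Back in May and June of last year, Dong and his advisor Lauren Gardner built a similar data visualization map for a project analyzing the risk of measles in the United States, which drew coverage from some mainstream US media at the time. The technical approach was therefore fairly mature, and the COVID-19 dashboard could be debugged and brought online quickly. On January 21, at a meeting of the doctoral students' group, Gardner, an associate professor at the Center for Systems Science and Engineering, was discussing plans for the new semester with the group. After learning of the COVID-19 outbreak in China, she asked Dong whether he would build a data dashboard.
Dong's research focuses on disease modeling, that is, using mathematical models and computer code to explain questions in epidemiology and public health and to make basic assessments and projections of how global epidemics will develop. He had already been collecting data in preparation for such a project, so the two agreed at once. Seven or eight hours later the first version of the visualization map was finished, and on January 22 the site went public.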
Launched on January 22, this real-time tracking map was created and is maintained by Lauren Gardner, an associate professor in the Department of Civil and Systems Engineering at Johns Hopkins University, together with two Chinese students, Dong Ensheng and Du Hongru, first-year PhD students at the university's Center for Systems Science and Engineering.
"On January 21, we (Dong and his advisor) agreed to make the interactive dashboard. I spent about seven to eight hours that night completing the first edition. Then my advisor posted the dashboard on Twitter at around 11 a.m. on January 22," said Dong.
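Video source: CCTV News
Dong and his advisor originally built the dashboard only to collect data and prepare for their next stage of academic research. They did not expect that, as the outbreak developed, it would become a statistical reference followed around the world. That has made Dong and the team feel a heavier responsibility and a greater need to work day and night to keep the data rigorous and accurate. Today the site is the most cited source of outbreak data among senior government officials, public health scholars and mainstream media in many countries, and updating and running it has become Dong's "main job."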
Their original intention in creating this dashboard was to collect data for academic research. As the epidemic developed, however, it became the most cited source of epidemic data for government officials, public health scholars and mainstream media in many countries. Updating and operating the website has become Dong's "main job."
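At first, Dong and his advisor compiled the data by hand and updated it twice a day, in the morning and evening. As the situation changed, that approach became unsustainable, so the project switched to semi-automated updates, and Du Hongru joined the data collection and dashboard work.
By early March, the data team had broken down the US figures to the county level. "The United States has more than 3,000 counties, and there are over 200 countries and regions in the world, so doing it all by hand is very hard," Dong told China Newsweek. The team therefore recruited volunteers and split them into groups, some covering other countries and some covering different regions of the United States, to publish the latest data around the clock.
Dong's team has grown from the original two or three people to nearly 50, including other doctoral students in the department, volunteers from other schools at the university, and a partner company that provides technical support.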
At first, Dong and his advisor updated the map data manually twice a day, in the morning and evening. But as the pandemic unfolded, they found that manual updates were unsustainable, so they decided to automate parts of the process and invited Du to work with them.
As the coronavirus continues to spread, the volume of data to be tracked keeps growing. Dong's team has gradually grown from two or three people to nearly 50, including other doctoral students in the department, volunteers from other colleges and technicians from the company that provides technical support.
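Dong Ensheng (left) and Du Hongru (right). Image source: CCTV News
Since joining on February 1, Du Hongru has been mainly responsible for writing the automatic-update code and for comparing the collected data with the figures published by the WHO to ensure consistency and accuracy.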
Du joined the team on February 1. His main work is to write code for automatic updates and to compare the data the team collects with the numbers released by the World Health Organization (WHO), ensuring data consistency and accuracy.
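Du said: "The hardest part is that these data sources come in different formats and usually in different languages. We need to bring the sources together, organize and clean them into the format we need, and then upload them to the dashboard."
"The most difficult thing is that these data sources are all in different formats and often different languages. We need to gather each data source, organize and adjust it into the format we need, then upload it to the dashboard," said Du.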
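Video source: CCTV News
According to the official description of the Johns Hopkins University dashboard, its data sources include the World Health Organization (WHO), official health and disease control agencies in China, the United States and Europe, local media, and third-party data platforms such as DXY.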
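The clean-up and cross-check steps Du describes can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the team's actual code: the `CaseRecord` schema, the `FIELD_ALIASES` table, the helper names and all sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseRecord:
    """Hypothetical common schema that every source is converted into."""
    region: str
    confirmed: int
    deaths: int


# Hypothetical field names that different sources might use for the same value,
# including a source that labels its columns in Chinese.
FIELD_ALIASES = {
    "region": ("region", "country", "地区"),
    "confirmed": ("confirmed", "cases", "确诊"),
    "deaths": ("deaths", "dead", "死亡"),
}


def _pick(raw, field):
    """Return the first value in `raw` stored under any known alias of `field`."""
    for alias in FIELD_ALIASES[field]:
        if alias in raw:
            return raw[alias]
    raise KeyError(f"no value for {field!r} in {raw!r}")


def _to_int(value):
    """Accept ints or strings such as '1,234'."""
    return int(str(value).replace(",", "").strip())


def normalize(raw):
    """Convert one source-specific row into the common schema."""
    return CaseRecord(
        region=str(_pick(raw, "region")).strip(),
        confirmed=_to_int(_pick(raw, "confirmed")),
        deaths=_to_int(_pick(raw, "deaths")),
    )


def find_mismatches(collected, reference, tolerance=0):
    """List regions whose confirmed count differs from the reference by more than `tolerance`."""
    ref = {r.region: r.confirmed for r in reference}
    return sorted(
        r.region
        for r in collected
        if r.region in ref and abs(r.confirmed - ref[r.region]) > tolerance
    )


# Made-up sample rows in two different formats (not real case counts).
raw_sources = [
    {"country": "Region A", "cases": "1,234", "dead": "12"},
    {"地区": "Region B", "确诊": 500, "死亡": 5},
]
collected = [normalize(row) for row in raw_sources]

# Made-up reference figures standing in for an official bulletin.
reference = [CaseRecord("Region A", 1230, 12), CaseRecord("Region B", 500, 5)]

print(find_mismatches(collected, reference))
```

In this sketch a mismatch only flags a region for a person to review. Which figure to trust is left as a human decision, and a nonzero `tolerance` can absorb small differences caused by sources reporting at different times.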
The data sources include the World Health Organization, the U.S. Centers for Disease Control and Prevention, the European Centre for Disease Prevention and Control, the National Health Commission of the People's Republic of China, local media reports, local health departments, and DXY, one of the world's largest online communities for physicians, health care professionals, pharmacies and facilities.
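At a briefing on Capitol Hill in Washington on March 6, Lauren Gardner said the dashboard had been attracting attention for some time and was averaging 1 billion hits a day, with a peak of 2 billion in a single day. There had been several spikes along the way: when the outbreak erupted in Italy, for example, so many Italians flocked to the site that users from Italy outnumbered those from the United States.
Image source: Facebook
Dong said: "This is roughly the usage of our outbreak map, the usage of one of our map layers. As of March 31, we had about 15.5 billion uses worldwide."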
As of March 31, one layer of the map had been used about 15.5 billion times worldwide, said Dong.
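Video source: CCTV News
Besides the sense of honor that comes from taking part in a project that has drawn global attention, and the chance to pick up expertise in several fields in a short time, maintaining the site has strengthened both students' sense of responsibility and academic rigor. With the pandemic continuing to affect the whole world, both also believe that countries should strengthen cooperation and learn from China's successful prevention and control experience so that the global spread can be brought under control soon.
Du said: "Judging from the data, the United States is currently the worst hit in the world. China's epidemic prevention and control is a good example for other countries. I hope countries around the world can draw on China's control measures, and I hope the global outbreak can be brought under control soon."
Sources: China Newsweek, CGTN, jhu.edu, CCTV News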