The boring technology behind a one-person Internet company

Listen Notes is a podcast search engine and database. The technology behind Listen Notes is actually very, very boring. No AI, no deep learning, no blockchain. “Any man who must say I am using AI is not using True AI” :)

After reading this post, you should be able to replicate what I built for Listen Notes or easily do something similar. You don’t need to hire a lot of engineers. Remember, when Instagram raised $57.5M and got acquired by Facebook for $1B, they had only 13 employees, and not all of them were engineers. The Instagram story happened in early 2012. It’s 2019 now, and it’s more feasible than ever to build something meaningful with a tiny engineering team, even a team of one.

If you haven’t used Listen Notes yet, try it now:

https://www.listennotes.com/


Overview

Let’s start with the requirements and features of the Listen Notes project.

Listen Notes provides two things to end users:


  • A website, ListenNotes.com, for podcast listeners. It provides a search engine, a podcast database, Listen Later playlists, Listen Clips, which lets you cut a segment out of any podcast episode, and Listen Alerts, which notifies you when a specified keyword is mentioned in new podcasts on the Internet.

  • Podcast Search & Directory APIs for developers. We need to track API usage, collect payments from paying users, do customer support, and more.

I run everything on AWS. There are 20 production servers (as of May 5, 2019):


You can easily guess what each server does from its hostname.

  • production-web serves web traffic for ListenNotes.com.

  • production-api serves API traffic. We run two versions of the API (as of May 4, 2019), namely v1api (the legacy version) and v2api (the new version).

  • production-db runs PostgreSQL (primary and replica).

  • production-es runs an Elasticsearch cluster.

  • production-worker runs offline processing tasks to keep the podcast database always up-to-date and to provide some magical things (e.g., search result ranking, episode/podcast recommendations…).

  • production-lb is the load balancer. I also run Redis & RabbitMQ on this server, for convenience. I know this is not ideal. But I’m not a perfect person :)

  • production-pangu is the production-like server where I sometimes run one-off scripts and test changes. What’s the meaning of “pangu”?

Most of these servers can be horizontally scaled. That’s why I name them production-something1, production-something2… It would be very easy to add production-something3 and production-something4 to the fleet.

Backend

The entire backend is written in Django / Python3. The operating system of choice is Ubuntu.


I use uWSGI to serve web traffic. I put NGINX in front of the uWSGI processes; NGINX also serves as the load balancer.

The main data store is PostgreSQL, with which I have many years of development & operational experience. Battle-tested technology is good, so I can sleep well at night. Redis is used for various purposes (e.g., caching, stats…). It’s not hard to guess that Elasticsearch is used somewhere. Yes, I use Elasticsearch to index podcasts & episodes and to serve search queries, just like most boring companies.

Celery is used for offline processing. Celery Beat schedules tasks; it is like cron jobs but a bit nicer. If Listen Notes gains traction in the future and Celery & Beat cause scaling issues, I will probably switch to the two projects I built for my previous employer: ndkale and ndscheduler.
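To make the Celery Beat idea concrete, here is a minimal sketch of a periodic-task table as it might appear in a Django settings module. It assumes the Celery app loads Django settings with the `CELERY` namespace, so that `CELERY_BEAT_SCHEDULE` is picked up. The task paths and intervals are hypothetical and are not taken from the Listen Notes codebase.

```python
from datetime import timedelta

# Hypothetical Django settings fragment. It assumes the Celery app is
# configured with:
#   app.config_from_object("django.conf:settings", namespace="CELERY")
# so Celery Beat reads its periodic tasks from CELERY_BEAT_SCHEDULE.
CELERY_BEAT_SCHEDULE = {
    "refresh-podcast-feeds": {
        # Hypothetical task path, for illustration only.
        "task": "podcasts.tasks.refresh_feeds",
        # Run every 15 minutes; a timedelta is an accepted schedule value.
        "schedule": timedelta(minutes=15),
    },
    "rebuild-search-ranking": {
        # Hypothetical task path, for illustration only.
        "task": "search.tasks.rebuild_ranking",
        # Run once a day.
        "schedule": timedelta(hours=24),
    },
}
```

Each entry names a task and how often Beat should enqueue it; the workers then execute it like any other Celery task.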

Supervisord is used for process management on every server.


Wait, how about Docker / Kubernetes / serverless? Nope. As you gain experience, you know when not to over-engineer. I actually did some early Docker work for my previous employer back in 2014, which was good for a mid-sized billion-dollar startup but may be overkill for a one-person tiny startup.


Frontend

The web frontend is primarily built with React + Redux + Webpack + ES. This is pretty standard nowadays. When deploying to production, JS bundles are uploaded to Amazon S3 and served via CloudFront.

On ListenNotes.com, most web pages are half server-side rendered (Django template) and half client-side rendered (React). The server-side rendered part provides the boilerplate of a web page, and the client-side rendered part is basically an interactive web app. A few web pages are rendered entirely on the server side, partly because I’m too lazy to make things perfect and partly for some potential SEO benefits.

Audio player

I use a heavily modified version of react-media-player to build the audio player on ListenNotes.com. It is used in several places, including the Listen Notes website, the Twitter embedded player, and the embedded player on third-party websites:

Podcast API

We provide a simple and reliable podcast API to developers. Building the API is similar to building the website. I use the same Django/Python stack for the backend, and ReactJs for the frontend (e.g., API dashboard, documentation…).


For the API, we need to track how many requests each user makes in the current billing cycle, and charge $$$ at the end of the billing cycle. It’s not hard to imagine that Redis is heavily used here :)
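One common way to count requests per user per billing cycle is to increment one counter per (user, cycle) key. The sketch below illustrates that pattern only. The key format, the `UsageTracker` class, and the in-memory `Counter` standing in for Redis are all assumptions made for the example; the post does not describe the actual implementation.

```python
from collections import Counter
from datetime import date


def usage_key(user_id: int, cycle_start: date) -> str:
    """Build a hypothetical counter key, e.g. 'api_usage:7:2019-05-01'."""
    return f"api_usage:{user_id}:{cycle_start.isoformat()}"


class UsageTracker:
    """Counts API requests per user and billing cycle.

    A Counter stands in for Redis here; with a real Redis client each
    increment would typically be a single INCR on the same key.
    """

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record_request(self, user_id: int, cycle_start: date) -> int:
        """Record one request and return the running total for the cycle."""
        key = usage_key(user_id, cycle_start)
        self._counts[key] += 1
        return self._counts[key]

    def usage(self, user_id: int, cycle_start: date) -> int:
        """Return the number of requests recorded so far (0 if none)."""
        return self._counts[usage_key(user_id, cycle_start)]
```

Because the cycle start date is part of the key, a new billing cycle starts from zero automatically, and the previous cycle's total stays available for invoicing.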

DevOps

Machine provisioning & code deployment

I use Ansible for machine provisioning. Basically, I wrote a bunch of YAML files that specify which configuration files & software each type of server needs. I can spin up a server with all the correct configuration files & all the software installed with the push of one button. This is the directory structure of those Ansible YAML files:

I also use Ansible to deploy code to production. Basically, I have a wrapper script deploy.sh that is run on macOS:


./deploy.sh production HEAD web


The deploy.sh script takes three arguments:


  • Environment: production or staging.

  • Version of the listennotes repo: HEAD means “just deploy the latest version”. If the SHA of a git commit is specified, it deploys that specific version of the code, which is particularly useful when I need to roll back from a bad deployment.

  • What kind of servers: web, worker, api, or all. I don’t have to deploy to all servers at once. Sometimes I only change JavaScript code, so I just need to deploy to web without touching api or worker.
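The three-argument interface described above can be sketched in Python with `argparse`. This is only an illustration of the command-line contract: the real deploy.sh is a shell wrapper whose contents the post does not show, and the function name `parse_deploy_args` is hypothetical.

```python
import argparse


def parse_deploy_args(argv):
    """Parse '<environment> <version> <target>' as described in the list above."""
    parser = argparse.ArgumentParser(prog="deploy.sh")
    parser.add_argument("environment", choices=["production", "staging"])
    # HEAD means the latest version; otherwise a git commit SHA.
    parser.add_argument("version", help="HEAD or a git commit SHA")
    parser.add_argument("target", choices=["web", "worker", "api", "all"])
    return parser.parse_args(argv)


# Mirrors the invocation: ./deploy.sh production HEAD web
args = parse_deploy_args(["production", "HEAD", "web"])
print(args.environment, args.version, args.target)
```

Restricting the environment and target to a fixed set of choices makes a typo fail immediately instead of partway through a deployment.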

The deployment process is mostly orchestrated by Ansible yaml files, and of course, it’s dead simple:


  • On my MacBook Pro, if the deployment targets web servers, build the JavaScript bundles and upload them to S3.

  • On the target servers, git clone the listennotes repo to a timestamp-named folder, check out the specific version, and pip install new Python dependencies if any.

  • On the target servers, switch the symlink to the above timestamp-named folder and restart the servers via supervisorctl.
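The timestamp-named folder plus symlink switch can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual Ansible tasks: the directory layout and function names are hypothetical, `os.makedirs` stands in for the git clone, checkout, and pip install steps, and the swap uses a rename of a temporary link, which replaces the old link in one step on POSIX systems. Restarting processes via supervisorctl is left out.

```python
import os
import time


def timestamp_name() -> str:
    """Return a folder name such as '20190505123000'."""
    return time.strftime("%Y%m%d%H%M%S")


def activate_release(releases_dir: str, current_link: str, release_name: str) -> str:
    """Create a release folder and point the 'current' symlink at it."""
    release_path = os.path.join(releases_dir, release_name)
    # Stand-in for: git clone, check out the requested version, pip install.
    os.makedirs(release_path, exist_ok=True)

    # Build the new link under a temporary name, then rename it over the
    # old one so readers never observe a missing 'current' link.
    tmp_link = current_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(release_path, tmp_link)
    os.replace(tmp_link, current_link)
    return release_path
```

Because older release folders stay on disk, rolling back amounts to pointing the link at a previous folder and restarting the processes.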

As you can see, I don’t use those fancy CI tools. Just dead simple things that actually work.


Monitoring & alerting

I use Datadog for monitoring & alerting. I’ve got some high-level metrics in a simple dashboard. Everything I do here is meant to boost my confidence when I am messing around with the production servers.

I connect Datadog to PagerDuty. If something goes wrong, PagerDuty will send me alerts via phone call & SMS.


I also use Rollbar to keep an eye on the health of the Django code. Rollbar catches unexpected exceptions and notifies me via email & Slack as well.

I use Slack a lot. Yes, this is a one-person company, so I don’t use Slack for communicating with human beings. I use Slack to monitor interesting application-level events. In addition to integrating Datadog and Rollbar with Slack, I also use Slack incoming webhooks in the Listen Notes backend code to notify me whenever a user signs up or performs some interesting action (e.g., adding or deleting things). This is a very common practice in tech companies. When you read books about the early history of Amazon or PayPal, you’ll learn that both companies had a similar notification mechanism: whenever a user signed up, there would be a “ding” sound to notify everyone in the office.
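Posting to a Slack incoming webhook boils down to an HTTP POST with a small JSON body containing a `text` field. Here is a minimal standard-library sketch; the function names and the example message are hypothetical, and the webhook URL must be the one Slack generates for your own workspace.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build the POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify(webhook_url: str, text: str, timeout: float = 5.0) -> int:
    """Send the message and return the HTTP status code."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

In a Django backend, a call such as `notify(url, "New signup: someone@example.com")` would usually run inside a background task, so a slow Slack response never delays the user's request.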

Since launching in early 2017, Listen Notes hasn’t had any big outage (> 5 minutes) except for this one. I’m always very careful & practical about operational matters. The web servers are significantly over-provisioned, just in case there’s a huge spike due to press events or whatever.

Development

I work in a WeWork coworking space in San Francisco. Some people may wonder why I don’t just work from home or from some random coffee shop. Well, I value productivity a lot and I’m willing to invest money in it. I don’t believe piling on hours helps software development (or any sort of knowledge/creativity work). It’s rare that I work over 8 hours in a day (sorry, 996 people). I want to make every minute count. Thus, a nice & relatively expensive private office is what I need :) Instead of optimizing for spending more time & saving money, I optimize for spending less time & making money :)

I’m using a MacBook Pro. I run the (almost) identical infrastructure inside Vagrant + VirtualBox. I use the same set of Ansible yaml files as described above to provision the development environment inside Vagrant.


I subscribe to the monolithic repo philosophy. So there’s one and only one listennotes repo, containing DevOps scripts, frontend & backend code. This listennotes repo is hosted as a GitHub private repo. I do all development work on the main branch. I rarely use feature branches.


I write code and run the dev servers (Django runserver & webpack dev server) by using PyCharm. Yea, I know, it’s boring. After all, it’s not Visual Studio Code or Atom or whatever cool IDEs. But PyCharm works just fine for me. I’m old school.


Miscellaneous

There are a bunch of useful tools & services that I use to build Listen Notes as a product and a company:


  • iTerm2 and tmux for terminal work.

  • Notion for TODO lists, wiki, taking notes, design documents…

  • G Suite for the @listennotes.com email account, calendar, and other Google services.

  • MailChimp for sending the monthly email newsletter.

  • Amazon SES for sending transactional & some marketing emails.

  • Gusto to pay myself and contractors who are not from Upwork.

  • Upwork to find contractors.

  • Google Ads Manager to manage direct sales ads and track performance.

  • Carbon Ads and BuySellAds for fallback ads.

  • Cloudflare for DNS management, CDN, and firewall.

  • Zapier and Trello to streamline the podcaster interview workflow.

  • Medium for the company blog (obviously).

  • Godaddy and Namecheap for domain names.

  • Stripe for getting money from users (primarily for the API).

  • Google speech-to-text API to transcribe episodes.

  • Kaiser Permanente for health insurance.

  • Stripe Atlas to incorporate Listen Notes, Inc.

  • Clerky to generate legal documents for fundraising (SAFE) and for hiring contractors who are not from Upwork.

  • Quickbooks for bookkeeping.

  • 1password to manage login credentials for tons of services.

  • Brex for a charge card. You can get an incremental $5000 in AWS credits, which can be applied on top of the AWS credits from WeWork or Stripe Atlas.

  • Bonvoy Business Amex Card. You can earn Marriott Bonvoy points for luxury hotels and flights. These are the best credit card points for traveling :)

  • Capital One Spark for the checking account.

Keep calm and carry on…

As you can see, we are living in a wonderful age to start a company. There are so many off-the-shelf tools and services that save us time & money and increase our productivity. It’s more possible than ever to build something useful to the world with a tiny team (or just one person), using simple & boring technology.


As time goes on, companies become smaller and smaller. You don’t need to hire tons of full-time employees. You can hire services (SaaS) and on-demand contractors to get things done.

Most of the time, the biggest obstacle to building & shipping things is overthinking. What if this, what if that. Boy, you are not important at all. Everyone is busy with their own life. No one cares about you and the things you build until you prove that you are worth other people’s attention. Even if you screw up the initial product launch, few people will notice. Think big, start small, act fast. It’s absolutely okay to use boring technology and start with something simple (even ugly), as long as you actually solve problems.

There are so many cargo-cult-type people now. Ignore the noise. Keep calm and carry on.



If you haven’t used Listen Notes yet, try it now:

https://www.listennotes.com/


Translated from: https://www.freecodecamp.org/news/the-boring-technology-behind-a-one-person-internet-company/
