Disabling Google App Engine disabled my site’s database

A sad Firebase Firestore story

Unfortunately, last week some of https://like.co’s users encountered 502 timeout errors.

[Image: Not good]

It turned out that the Firebase Firestore database we were using had been disabled and was not responding to queries.

What happened

We use Firebase and Google Cloud Platform (GCP) extensively for https://like.co’s services. We have separate production and staging clusters, but both live in the same GCP project.

Before the issue occurred, one of our team members was testing the new Google App Engine (GAE) standard environment for Node.js (by the way, it is great, but it runs out of memory easily with Nuxt.js SSR). After the prototyping was finished, he could not delete the only instance left in GAE without deleting the whole project, so he decided to just disable GAE instead.

This somehow disabled Firestore access and caused a lot of server responses to time out. We had set up a 5xx alert on nginx, but in this case the responses were discarded, so unfortunately the alert was not triggered. We eventually found out about the issue from user reports.
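To illustrate why a 5xx-only alert can miss this kind of failure, here is a minimal sketch of a log check that also counts discarded responses. It assumes nginx's default "combined" access log format, and it assumes the discarded responses were logged with nginx's non-standard status 499 (client closed the connection); we did not verify the exact status at the time. The function names and the threshold are hypothetical.

```python
import re
from collections import Counter

# Assumption: nginx "combined" log format, where the status code
# directly follows the quoted request line.
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def count_problem_statuses(log_lines):
    """Count 5xx responses and 499 (client closed connection) separately."""
    counts = Counter()
    for line in log_lines:
        match = STATUS_RE.search(line)
        if not match:
            continue  # skip lines that do not look like access log entries
        status = int(match.group(1))
        if status == 499:
            counts["client_closed"] += 1
        elif 500 <= status <= 599:
            counts["server_error"] += 1
    return counts


def should_alert(counts, threshold=5):
    """Alert on the combined total, not on 5xx alone (threshold is arbitrary)."""
    return counts["client_closed"] + counts["server_error"] >= threshold


# Made-up sample lines for illustration only.
sample = [
    '1.2.3.4 - - [01/Jan/2018:00:00:00 +0000] "GET /api/users HTTP/1.1" 200 512 "-" "curl/7.58.0"',
    '1.2.3.4 - - [01/Jan/2018:00:00:01 +0000] "GET /api/users HTTP/1.1" 499 0 "-" "curl/7.58.0"',
    '1.2.3.4 - - [01/Jan/2018:00:00:02 +0000] "GET /api/users HTTP/1.1" 502 0 "-" "curl/7.58.0"',
]
sample_counts = count_problem_statuses(sample)
```

A 5xx-only rule would see one problem in this sample, while the combined count sees two.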

The Problem

It seems that Firebase Firestore relies on Cloud Datastore, and Cloud Datastore relies on GAE being enabled.

[Image: You were hinted]

The documentation did not mention that Cloud Datastore needs to be enabled, except for this paragraph: “You can’t use both Cloud Firestore and Cloud Datastore in the same project, which might affect apps using App Engine. Try using Cloud Firestore with a different project.”

While it does not say whether disabling GAE affects Firestore, there was a 2016 issue stating that Datastore needs GAE to be enabled. Firestore is probably running on top of Cloud Datastore and inherits this behaviour.

Thus, disabling GAE causes Firestore to become inaccessible too.
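Given this dependency, one thing a monitor could watch is the App Engine application's serving status. The sketch below is not something we ran during the incident. It assumes you already have the parsed JSON of an App Engine Admin API `apps.get` response (or of `gcloud app describe --format=json`), whose `servingStatus` field reports values such as `SERVING` or `USER_DISABLED`. The function name and the sample dictionaries are hypothetical.

```python
def app_engine_is_serving(application):
    """Return True only if the App Engine application reports SERVING.

    `application` is the parsed JSON describing the application,
    e.g. the body of an Admin API `apps.get` response.
    """
    return application.get("servingStatus") == "SERVING"


# Hypothetical payloads, trimmed to the one field this check reads.
enabled_app = {"id": "example-project", "servingStatus": "SERVING"}
disabled_app = {"id": "example-project", "servingStatus": "USER_DISABLED"}
```

If the dependency described above holds, a `USER_DISABLED` status on a project that uses Firestore would be worth alerting on before any user notices.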

How to fix it

We simply re-enabled Google App Engine (and thus Cloud Datastore) and everything was working again.

[Image: Sounds harmless? NO!]

Lessons Learned

Do not try new features on a production Google Cloud Platform project (how embarrassing…)

Split production and development environments not only at the cluster level, but also at the project level

Do not disable a seemingly harmless feature on production before testing what will happen in a development environment

Set up more generic alerts for monitoring instead of ad-hoc ones (e.g. only 5xx)
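As one possible shape for a more generic alert, here is a minimal sketch of a dependency probe with a timeout. Instead of waiting for a particular status code to appear in nginx, it calls a check function (for example, a cheap database read) and reports whether it succeeded, failed, or hung. The `probe` helper and the example checks are hypothetical, and a real check against Firestore would replace the placeholder functions.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def probe(check, timeout_seconds=2.0):
    """Run `check` in a worker thread and return 'ok', 'error' or 'timeout'."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(check)
    try:
        future.result(timeout=timeout_seconds)
        return "ok"
    except FutureTimeout:
        # The check did not finish in time; treat a hang as a failure.
        return "timeout"
    except Exception:
        # The check raised; the dependency answered, but with an error.
        return "error"
    finally:
        # Do not block on a hung check. The worker thread is abandoned,
        # not killed, so the check itself should also have its own timeout.
        executor.shutdown(wait=False)


def healthy_check():
    """Placeholder for a cheap read against the real dependency."""
    return None


def hanging_check():
    """Placeholder that simulates a dependency that never answers in time."""
    time.sleep(0.3)
```

Calling `probe(hanging_check, timeout_seconds=0.05)` reports a timeout even though no 5xx response is ever produced, which is the case our original alert missed.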

Follow us:

Website: https://like.co/

Medium: medium.com/likecoin/zh/

Facebook: fb.com/LikeCoin.Foundation, fb.com/groups/likecoin

Telegram group: t.me/likecoin

Twitter: twitter.com/likecoin_fdn

Youtube: youtube.com/c/LikeCoin

Github: github.com/likecoin
