# Tony Meng

[tony@tonymeng.com](mailto:tony@tonymeng.com)&nbsp;•&nbsp;[github.com/tonymeng](https://github.com/tonymeng)&nbsp;•&nbsp;[linkedin.com/in/tmeng](https://www.linkedin.com/in/tmeng/)

B.S. Computer Science — The University of Texas at Austin

## Technical Skills

- **Infrastructure & Cloud:** AWS, GCP, Terraform, Kubernetes (EKS), Helm, Istio, NGINX
- **CI/CD & Automation:** GitHub Actions, Harness CD, Spinnaker, Backstage
- **Observability:** SLOs, Datadog, Prometheus, Grafana, StatsD, Statsite, BigQuery
- **Data & Storage:** Postgres (BDR on EC2), RDS, PgBouncer, MongoDB, Solr, ZooKeeper

## Experience

### Brain Company — AI Platform Engineer <span class="dates">May 2026 – Current</span>

### ClickUp — Infrastructure Architect <span class="dates">Jun 2022 – May 2026</span>

- Built an AI-agent system executing 14k manual test cases on demand, turning manual QA into elastic capacity.
- Cut new-service and new-infra provisioning time by 99% via a Backstage-driven IDP on centrally managed Terraform.
- Standardized compute across 28 EKS clusters, 5 regions, and 6 environments (Helm, Terraform); reduced compute spend by 50%.
- Designed geo-homing for ClickUp's data layer: localized residency for millions of workspaces, −65% storage cost, −99% setup time.
- Drove the storage-layer migration, unlocking >400% headroom for data growth.
- Led the infrastructure integration of an acquired company.

### Salesforce — Software Engineering Architect <span class="dates">Apr 2020 – Jun 2022</span>

- Owned infra architecture for 100+ services and ~200 engineers across Search & AI.
- Standardized compute (EKS) and service mesh / mTLS (Istio) across 50+ clusters in 10+ regions.
- Migrated stateful systems (MongoDB, ZooKeeper, Solr) to Kubernetes.
- Standardized observability (Datadog → Argus) and drove org-wide SLO adoption.

### Medium — Staff Engineer / SRE <span class="dates">Aug 2019 – Jan 2020</span>

- Led SLO implementation and monitoring across backend and infra teams.
- Standardized distributed on-call, removing manual triage as a reliability tax.

### Google / Firebase — L5 Software Engineer / SRE <span class="dates">Mar 2014 – Apr 2019</span>

- Scaled Firebase Realtime Database client capacity 10× by [redesigning NGINX-based load balancing](https://firebase.googleblog.com/2017/04/increasing-realtime-database.html).
- Implemented GDPR compliance ("Wipeout") across 50+ services.
- Built a Kubernetes-based load generator simulating 10M+ concurrent clients.
- Built terabyte-scale product and operational metrics pipelines (StatsD, Statsite, BigQuery, Graphite, Grafana).

### Salesforce — Senior Member of Technical Staff <span class="dates">Jan 2011 – Mar 2014</span>

- Designed and shipped [Chatter Trending Topics](https://www.salesforce.com/ap/company/news-press/press-releases/2013/03/130320/) and collaborative-filtering recommendations.
- Designed and shipped [Chatter Knowledgeable Users](https://www.salesforce.com/ap/company/news-press/press-releases/2013/04/130405/) using cosine-similarity-based collaborative filtering.
