How to Test Distributed Systems in a Single Environment Using Proxy Routing

Shift‑Left Testing Without a Dedicated QA Environment

Without a dedicated QA environment, teams at one organization faced significant technical and coordination challenges when testing a distributed system. Their homebuilt CLI for starting test environments had become slow and unmaintainable, which prompted a shift‑left strategy built on automated testing.

They built an internal deployment tool that creates versioned deployments through CI and uses proxy-based dynamic routing, so developers can run isolated tests against multiple versions and catch bugs much earlier in the process.

---

Background & Challenges

At Dev Summit Boston, Po Linn Chia explained how their team reused a single development environment to deploy multiple service versions for distributed system testing.

Key pain points without a QA environment:

  • Social & technical friction — socio‑technical issues were harder to resolve than purely technical ones.
  • Environment composed of numerous microservices running on a single ECS cluster.
  • Resource contention — teams competed for the same microservices.
  • Coupling risks — changes in one microservice disrupted others.
  • Previous tool limitations:
      • Homebuilt CLI for environment startup via CI runner.
      • Slow initialization: 15–30 minutes before tests could start.
      • Frequent build failures from timeouts.
      • Unmaintainable after original author left.

---

The Shift‑Left Solution

The team shifted left with automated integration testing, prioritizing better tooling and team coordination.

Benefits delivered:

  • Single environment hosting multiple concurrently deployed versions.
  • Deployments handled via CI with header-based routing.
  • Developers could run integration tests earlier in the development cycle.
  • Bugs detected sooner — reducing downstream disruption.

---

Dynamic Version Routing with Traefik Proxy

Internal process:

> Under the hood, we spin up the appropriate ECS task with the desired version and register conditional routing rules with a proxy (Traefik) that inspects Baggage headers.
>
> The header contains:
>
> • `dynamic_route=VERSION` → routes request to that version.
> • Absent header → defaults to `main`.

Routing example:

  • `http://my-service.classpass.com` → routes to main when no header present.
  • `Baggage: dynamic_route=feature-2981` → routes to feature‑2981 instance.
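
The routing decision described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the team's Traefik configuration: the function names `parse_baggage` and `resolve_version` are hypothetical, and the parser assumes the comma-separated `key=value` layout of the W3C Baggage header.

```python
from urllib.parse import unquote

DEFAULT_VERSION = "main"      # fallback when no routing entry is present
ROUTE_KEY = "dynamic_route"   # Baggage key named in the talk


def parse_baggage(header_value: str) -> dict[str, str]:
    """Parse a Baggage header value into a key/value dict (hypothetical helper)."""
    entries: dict[str, str] = {}
    for member in header_value.split(","):
        # Drop optional ";property" metadata that may follow a value.
        pair = member.split(";", 1)[0].strip()
        if "=" not in pair:
            continue
        key, value = pair.split("=", 1)
        entries[key.strip()] = unquote(value.strip())
    return entries


def resolve_version(headers: dict[str, str]) -> str:
    """Return the version a request should be routed to."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    baggage = parse_baggage(normalised.get("baggage", ""))
    return baggage.get(ROUTE_KEY) or DEFAULT_VERSION
```

Calling `resolve_version({"Baggage": "dynamic_route=feature-2981"})` selects the feature deployment, while a request with no Baggage header falls back to `main`. In the setup described in the talk, the proxy performs this match, so application code does not need to.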

---

Telemetry & Monitoring

Chia’s approach:

  • Send APM data, custom metrics, and logs to third-party vendors.
  • Update telemetry metadata per deployment:
      • Service name.
      • Deployed version.
      • Baggage header for trace correlation.
  • Track performance/issues per version in development cluster.
  • Allows parallel monitoring without production canary rollout.
  • Supports testing of large changesets and major framework upgrades.
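
One way to picture per-deployment telemetry metadata is a small helper that stamps every log record with the service name and deployed version. The sketch below uses only Python's standard `logging` module; the helper names and tag keys (`service`, `version`, `baggage`) are assumptions for illustration and do not reflect any particular vendor's schema.

```python
import logging


def telemetry_tags(service: str, version: str, baggage: str = "") -> dict[str, str]:
    """Build the per-deployment metadata attached to every signal."""
    tags = {"service": service, "version": version}
    if baggage:
        # Keep the raw Baggage value so traces can be correlated by route.
        tags["baggage"] = baggage
    return tags


def versioned_logger(service: str, version: str) -> logging.LoggerAdapter:
    """Return a logger that stamps each record with service and version."""
    base = logging.getLogger(service)
    return logging.LoggerAdapter(base, telemetry_tags(service, version))
```

A deployment of `feature-2981` would create its logger with `versioned_logger("my-service", "feature-2981")`, so its records can be filtered separately from those of `main` running in the same cluster.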

---

Flexible Integration Testing

  • Integration tests run live in development.
  • Developers spin up ephemeral containers for isolated testing.
  • On-demand deployments outside CI allow:
      • Testing large changes (e.g., React framework upgrade).
      • Preventing front-end development disruptions.
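
An integration test can target a specific deployment simply by attaching the routing header to its requests. The following is a minimal sketch using the standard library: the host comes from the routing example earlier in the article, while `build_request` and the `/health` path are hypothetical.

```python
import urllib.request
from typing import Optional

BASE_URL = "http://my-service.classpass.com"  # host from the routing example


def build_request(path: str, version: Optional[str] = None) -> urllib.request.Request:
    """Create a request that the proxy can route to a chosen version."""
    request = urllib.request.Request(BASE_URL + path)
    if version:
        # Without this header the proxy falls back to the main deployment.
        request.add_header("Baggage", f"dynamic_route={version}")
    return request
```

A test would pass the result to `urllib.request.urlopen`, for example `urlopen(build_request("/health", "feature-2981"))`, and assert on the response. The same test can run against `main` by omitting the version argument.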

---

Repository Strategy for Tests

Chia explained how to choose between a shared repository and an application-specific repository for tests:

Use a shared repository when:

  • Tests cover critical core workflows involving multiple services.
  • Enables reuse across teams.
  • Requires careful design — failures may affect many services.

Use an application repo when:

  • Service is relatively self-contained.
  • Convenient, faster iteration.
  • Easier to maintain with the application code.

Practical workflow:

  • Start with tests in application repo.
  • Migrate to shared repo when value across teams becomes apparent.

---


Key Takeaways

  • Single environment, multi-version deployments can solve testing friction in microservice ecosystems.
  • Header-based dynamic routing avoids DNS/application code changes.
  • Telemetry per version enables safe experimentation before production canaries.
  • Shared repositories for cross-service tests maximize reuse but require careful governance.
