When I started at Epignosis, I folded Cursor into how I actually work, not as a novelty, but as something I reached for when the codebase outpaced my notes. It was the first time I relied on an AI editor for more than autocomplete in a real, high-complexity environment. What changed was not my typing speed; it was how I budget time between discovery and implementation.

The Context

Every codebase has a personality. History baked into it. Mine arrived as a large LMS: many modules, wires between them, and legacy paths you do not refactor for sport. Onboarding here is not “read the README and ship.” It is learning where the same word means different things depending on which folder you are in.

First Impressions

Cursor felt less like an autocomplete box and more like a second pair of eyes that could keep several files in view at once. The part that mattered for this post is not the suggestions themselves. It is that I could ask in plain language instead of chaining grep, IDE search, and hallway questions, and still know I had to verify everything myself. The rest of this article is about that balance.

My First Task: An Event in an Event-Driven Flow

My first assignment was to implement a new EDA (event-driven architecture) event: a message the system emits when something important happens, with consumers in other services or layers.

The questions piled up fast:

  • Where do handlers for similar events live?
  • How does the legacy stack publish or consume events compared to the newer REST API?
  • What breaks if the payload shape is wrong?

Here is where I stopped treating all search tools as interchangeable.

grep / ripgrep is unbeatable when you already know the string: a class name, a queue name, a constant. You get a list of hits, you open files, you stitch the story together yourself. It is honest work. It breaks down when the codebase uses five phrases for the same idea, or when the important word is buried in a string builder three calls deep.

IDE search (scoped to a folder, or “find usages” on a symbol) widens the net. It is still fundamentally text and symbols. You can filter by path and file type, which helps in a monorepo, but you are still guessing which symbol to anchor on. If you pick the wrong entry point, you spend an afternoon in the wrong neighborhood.

Cursor with a narrow mission was different in kind, not only in degree. I could ask for “where events like this are published and consumed,” across legacy and newer layers, and get a proposed map: files grouped by role, sometimes synonyms I would not have thought to search. That map was often wrong in the details. It was still a faster wrong than a slow blind search because I could correct it with the debugger and a close read, instead of discovering I had been searching the wrong word at hour three.

None of that replaces grep. I still use it every day. The point is to match the tool to the uncertainty: exact string → ripgrep; symbol you trust → IDE; fuzzy concept spanning stacks → assistant first, then verify.
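To make that split concrete, here is a self-contained command-line demo with made-up file names and strings (plain grep so it runs anywhere; rg -n with the same patterns is the faster equivalent):

```shell
# Demo setup: two tiny files standing in for legacy and modern code.
mkdir -p /tmp/recon && cd /tmp/recon
printf 'function tlms_begin_impersonation() {}\n' > legacy.php
printf '$acl->actAs($target); // switch user\n' > modern.php

# Exact string you already trust: one precise hit, no noise.
grep -rn "tlms_begin_impersonation" .

# Fuzzy concept with several names: an alternation widens the net,
# but only if you already know the synonyms up front.
grep -rnE "impersonat|act[ _]?[Aa]s|switch[ _]user" .
```

The second command is exactly where text search starts to strain: you are enumerating synonyms by hand, which is the job I started handing to the assistant.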

A pattern I was looking for in existing handlers looked conceptually like this (dummy names and schema; only the shape matters):

{
  "event": "domain.course.created",
  "version": 1,
  "occurred_at": "2026-03-30T12:00:00Z",
  "tenant_id": "acme",
  "payload": {
    "course_id": "crs_01jqxyz",
    "actor_user_id": "usr_01abc"
  }
}
A matching publisher, stripped to its shape:
<?php
// Illustrative only, not production code

final class CourseCreatedPublisher
{
    public function __construct(
        private readonly EnvelopeFactory $envelopeFactory,
    ) {}

    public function publish(CourseCreated $event, EventBus $bus): void
    {
        $envelope = $this->envelopeFactory->fromDomainEvent($event);
        $bus->dispatch('domain.course.created', $envelope);
    }
}

Technically, what I cared about next was not “emit an event” in the happy path. It was contract compatibility: version in the envelope, whether consumers are at-least-once and therefore need idempotency keys, and what happens when one subscriber upgrades before another. I will say it plainly: event schemas are public APIs. Treat them as an afterthought and you will ship a breaking change that only shows up under load or in a downstream queue. Cursor helped me find where similar envelopes were assembled and validated; it did not write the idempotency strategy for me.
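A sketch of the idempotency side of that, with every class and property name hypothetical: key the consumer on fields the envelope already carries, so at-least-once redelivery collapses to a single side effect.

```php
<?php
// Illustrative only: an at-least-once consumer made idempotent by
// deriving a key from fields the envelope already carries. All names
// here are made up for the sketch, not our real code.

final class CourseCreatedConsumer
{
    /** @var array<string, true> Crude in-memory store; real code would use a DB table with a unique index. */
    private array $processed = [];

    public int $applied = 0; // observable stand-in for the real side effect

    public function handle(array $envelope): void
    {
        // Key on event name + entity id: duplicates of the same delivery collapse.
        $key = $envelope['event'] . ':' . $envelope['payload']['course_id'];

        if (isset($this->processed[$key])) {
            return; // at-least-once redelivery: safe to drop
        }

        $this->applied++; // apply the side effect exactly once

        $this->processed[$key] = true;
    }
}
```

The in-memory array is the part you would never ship; the shape (derive key, check, apply, record) is the part worth finding in the existing handlers.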

There was no single obvious entry point. I spent half a day in internal docs and file trees, useful but fragmented. That is when I switched from “read everything” to “tell the assistant what success looks like, then tear the answer apart,” with the same verification I would use after any senior’s sketch on a whiteboard.

From Vague Prompts to Missions

I started with a broad prompt:

Hi, what can you tell me about this project?

The answer was a reasonable map: major areas, how pieces relate. Fine for day one. It also showed me a rule I still use: vague questions get survey answers. They do not replace a goal.

So I reframed. Instead of “tell me about X,” I gave a mission with boundaries, stacks, folders, what “done” looks like. The next prompt looked like this:

Find every place user impersonation is implemented or checked. Include legacy PHP modules and the newer REST API code. List files and how they connect.

That shift, from open chat to scoped reconnaissance, is what made the tool feel earned instead of magical.

Case Study 1: Impersonation Across Legacy and REST

I kept the mission concrete: both legacy and REST API folders, and the interaction points between them, not a single happy path.

What came back was not a wall of prose. It read more like a reconnaissance brief: a short list of areas, then files grouped by role. In my own notes I distilled it into something like the structure below (names and paths are illustrative, not a copy-paste from our repo):

legacy/
  └── User/
      └── Impersonation*.php          # session / context switch
rest-api/
  └── src/
      └── Identity/
          └── ImpersonationGuard.php  # token + permission gate

Alongside that, it pointed to middleware or filters that attach identity to the request, the same layers I would have had to discover by stepping through with a debugger.

On the REST side, a simplified version of what I expected to find (and later verified line-by-line) looked like:

<?php
// Illustrative Laravel-style middleware; real code has more guards

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Symfony\Component\HttpFoundation\Response;

final class AttachImpersonationContext
{
    public function handle(Request $request, Closure $next): Response
    {
        $token = $request->attributes->get('session_token');
        if ($token?->isImpersonating()) {
            $request->attributes->set('effective_user_id', $token->subjectId());
        }
        return $next($request);
    }
}

And a fragment of legacy-style session switching might resemble:

<?php
// Illustrative legacy helper

function tlms_begin_impersonation(int $adminId, int $targetUserId): void
{
    $_SESSION['impersonation'] = [
        'admin_id' => $adminId,
        'target_id' => $targetUserId,
        'started_at' => time(),
    ];
}

Seeing both shapes in the same exploration session made it obvious where parity checks belong.

How I verified it

Cursor gave me a hypothesis graph. I still:

  1. Opened each suggested file and read the real control flow.
  2. Set a breakpoint on the REST path and walked the same route in XDebug.
  3. Compared: does legacy enforce the same invariants as the new API, or only some of them?

In one case the overview was slightly over-merged: two similarly named helpers were described as one flow when they served different entry points. That is not a reason to abandon the tool; it is a reason to treat its map as R&D output, not a spec.

Roughly, that first targeted round took on the order of minutes to produce a navigable list; doing the same with search keywords alone would have been hours of false positives and missed synonyms (“impersonate” vs “act as” vs “switch user”).

From a security angle, impersonation is where I am least willing to trust generated code. I want explicit invariants: who initiated the switch, whether it is time-bounded, whether audit logs record both identities, and whether APIs reject confused-deputy patterns. My view is that an assistant is fine for locating those invariants across stacks; it is not the authority on whether your threat model is complete. If the map says “middleware X,” I still read X and ask whether that is sufficient for every transport (browser session, API token, background job). That skepticism is not cynicism about the tool; it is how I sleep at night.
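Those invariants are the kind of thing I want spelled out in code rather than implied. A hypothetical sketch (none of these names come from our repo) of what "time-bounded, both identities audited" looks like when it is explicit:

```php
<?php
// Illustrative sketch of the invariants discussed above; every name
// is hypothetical. The point is that the checks are explicit.

final class ImpersonationSession
{
    public function __construct(
        public readonly int $adminId,      // who initiated the switch
        public readonly int $targetUserId, // who they act as
        public readonly int $startedAt,    // unix timestamp
    ) {}
}

final class ImpersonationGuardSketch
{
    private const MAX_DURATION_SECONDS = 3600; // time-bounded by policy

    public function assertStillValid(ImpersonationSession $s, int $now): void
    {
        if ($now - $s->startedAt > self::MAX_DURATION_SECONDS) {
            throw new RuntimeException('impersonation window expired');
        }
    }

    /** Audit lines must record BOTH identities, not only the effective one. */
    public function auditLine(ImpersonationSession $s, string $action): string
    {
        return sprintf('admin=%d acting_as=%d action=%s', $s->adminId, $s->targetUserId, $action);
    }
}
```

Whether an hour is the right bound is a policy question; that there is a bound, and that the audit line never loses the initiating admin, are the invariants I go looking for.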

Case Study 2: When “Permissions” Means Five Different Things

While tracing features, I kept hitting the word Permissions. In a smaller codebase that might be one module. Here it could mean:

Layer                   | What “permission” often refers to
------------------------|--------------------------------------------------
HTTP API                | Route or scope checks on specific endpoints
Domain / service        | Business rules (“may this user perform this action on this resource”)
RBAC                    | Roles and role-to-capability mapping
Product / feature flags | Gates that are not strictly authorization
Infra                   | Keys, environment, deployment; not user auth

After company onboarding (product and engineering orientation, not a public certification name), I could name concrete actions (“create user,” “update user,” “view user”), but I still did not know where those checks were enforced relative to “create course,” which spans UI, legacy API, and newer flows.

So I asked:

Where is permission enforced so a user cannot create a course? Distinguish UI-only checks from API enforcement, and legacy vs newer paths.

What made this answer useful

The valuable part was not “here is a file.” It was layering: legacy UI affordances, legacy API handlers, and REST handlers, with an explicit call-out where behavior could diverge, for example UI hiding a button while an API still allows the operation if called directly. That is the class of bug you hunt when you care about consistency, not just about compiling.

Dummy examples of what “three layers” can look like in practice:

// Client may hide UI without enforcing server-side (illustrative)

const canCreateCourse = usePermission('courses.create');
return canCreateCourse ? <CreateCourseButton /> : null;
The legacy API handler behind it is where enforcement actually has to happen:

<?php
// Illustrative legacy API handler

public function postCreateCourse(CreateCourseRequest $req): JsonResponse
{
    if (!$this->acl->userMay($req->user(), 'courses.create')) {
        return response()->json(['error' => 'forbidden'], 403);
    }
    // ...
}
And the newer REST layer expresses the same rule as a policy:

<?php
// Illustrative REST policy

final class CreateCoursePolicy
{
    public function create(User $actor, Tenant $tenant): bool
    {
        return $this->capabilities->granted($actor, $tenant, 'courses.create');
    }
}

The point of the exercise was not to memorize snippets like these. It was to know which of them actually runs for the client I cared about.

I am opinionated about defense in depth: the UI should reflect policy, but the server must enforce it. If those two disagree, I consider it a defect unless there is a documented, intentional reason (for example, progressive enhancement with a degraded mode, which still needs a story for direct API access). In a brownfield LMS, “permission” often leaks into feature flags and product experiments too. I do not think those should be conflated with RBAC in code, even when marketing uses one word for all of them. Naming and module boundaries matter because the next engineer will grep for permission and land in the wrong layer.

Trade-Offs: Cursor vs Other Tools

None of these replace the others; they have different failure modes:

Approach                  | Strength                       | Weakness
--------------------------|--------------------------------|-------------------------------------------------
grep / ripgrep            | Exact symbol search, fast      | Synonyms and indirect calls; no narrative
IDE “Find usages”         | Refinement                     | Noise in huge codebases; misses dynamic dispatch
Debugger                  | Ground truth for one execution | Slow to cover all branches
Cursor (directed prompts) | Cross-file story and synonyms  | Can over-merge or hallucinate edge paths

The workflow that worked for me: Cursor for a structured first pass, then the debugger and raw reading for proof. Cursor’s own documentation stresses context and rules; pairing that with repo-specific rules files (when your team maintains them) improves consistency.

Opinions I am willing to defend

  • Navigation beats codegen for onboarding. The highest leverage use of an AI editor in a large repo, for me, has been finding and relating code, not letting it draft whole features on day three. I would rather own fewer lines I understand than ship many I do not.
  • Context windows are a budget, not a miracle. Long chats drift. I restart threads when the task changes, and I pin concrete paths or symbols when I know them. Treating the assistant like a stateless search plus narrative layer keeps quality higher than pretending it remembers last week’s decision.
  • When grep wins: exact symbol renames, generated migrations, or a single known string across the repo. When Cursor wins: “this concept has five names and three frameworks.”
  • Telemetry still beats prose. If logs or traces show which component handled a request, that evidence outranks a confident paragraph from any model. I use AI to suggest where to add a log or breakpoint, not to replace runtime truth.

One technical detail: request identity

On the REST side, identity often flows through attributes populated by middleware (see the dummy AttachImpersonationContext earlier). That matters because authorization policies usually read the same effective user the domain services see. If those two disagree, you get bugs that look like “permissions are random.” When I explore, I explicitly ask how Request attributes, session state, and policy classes align. A boring question, but it prevents spectacular production incidents.
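One way to keep those aligned, sketched with hypothetical names: resolve the effective user in a single place that middleware output, policies, and domain services all go through, falling back to the authenticated user when no impersonation attribute is present.

```php
<?php
// Illustrative: resolve the effective identity in ONE place so that
// policies and domain services cannot disagree about who is acting.
// Names and the attribute key are assumptions, not our real code.

final class EffectiveUser
{
    /**
     * @param array<string, mixed> $attributes request attributes populated by middleware
     */
    public static function fromAttributes(array $attributes, int $authenticatedUserId): int
    {
        // If middleware attached an impersonation target, that is the
        // effective identity everywhere; otherwise the authenticated user.
        return $attributes['effective_user_id'] ?? $authenticatedUserId;
    }
}
```

The design choice is the single chokepoint: once every layer asks the same function, "permissions are random" bugs become "one resolver is wrong," which is debuggable.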

When I want a reproducible check after an exploratory chat, I sometimes leave a scratch assertion in a test or script (nothing that ships), just a guardrail for my own understanding:

<?php
// Spike / scratch: throw away after you trust the real integration

public function test_create_course_requires_capability(): void
{
    $this->actingAsUserWithout('courses.create');
    $response = $this->postJson('/api/v2/courses', ['title' => 'T']);
    $response->assertStatus(403);
}

A Small Playbook You Can Reuse Tomorrow

If you take one thing from this post, make it operational:

  1. Name the subsystem (e.g. “impersonation,” “course creation permissions”).
  2. Bound the search (legacy vs new, UI vs API).
  3. Ask for layers and divergence points, not just file paths.
  4. Verify with reads and, when it matters, a debugger.
  5. Log wrong merges when the model conflates two flows. Those notes train your next prompt.

Closing

The hard part of a large system is rarely typing the implementation. It is knowing which layer owns a rule, whether two stacks agree, and what breaks downstream. Cursor did not hand me that understanding. It narrowed where to look and sharpened the questions I asked in code review and in my own head.

I use it as a reconnaissance tool: point, verify, then own the change. That shortened the distance between “new on the team” and “comfortable changing this.” That is the bar I care about for the next task too.

If one belief ties this post together: in a brownfield codebase, competence shows up as impact awareness, not as commit velocity. Tools that pull impact into view earlier are worth learning. The rest is packaging.