I vectorized my plugin (and you can too!)

Codanna builds a vectorized map of your code, so AI tools can find things better. It works great with large WordPress plugins 🤩

I’ve been working on a bunch of AI stuff lately, and I’m starting to think about problems I have in WordPress-land that can benefit from what I’m learning in AI-land.

When somebody sent me a link to Codanna yesterday, I was intrigued.

With the memory project, we’ve been doing a lot of work with embeddings (I’ll get to those in a sec).

Codanna is a tool that allows you to generate embeddings based on your code. And because it supports PHP, it supports WordPress plugins.

What’s an embedding?

If you hang out around AI water coolers you may hear terms like embeddings or vector search.

An embedding is a way of turning a piece of text into a set of numbers that represent the concepts in that text.

Think of an embedding as coordinates in “meaning space.” Similar concepts are close together, different concepts are far apart.

Two embeddings can then be compared mathematically to see how conceptually similar the two pieces of text are.
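To make "compared mathematically" a bit more concrete, here's a minimal sketch using cosine similarity, one common way to compare two embeddings. The three tiny vectors are made-up numbers for illustration, not output from a real model, and I'm not claiming this is the exact measure any particular tool uses.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (invented values, not from a real model).
timezone_text = [0.9, 0.8, 0.1]
utc_offset_text = [0.8, 0.9, 0.2]
checkout_text = [0.1, 0.2, 0.9]

# Related texts point in a similar direction, so their score is closer to 1.
print(cosine_similarity(timezone_text, utc_offset_text))
print(cosine_similarity(timezone_text, checkout_text))
```

The closer the score is to 1, the more the two vectors point the same way in "meaning space."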

An example from WP Fusion: Say we’re working with our AI code editor (like Claude Code or Cursor), and we want to know where in WP Fusion’s plugin code timezone conversions happen.

We might search the files for some combination of “timezone”, “conversion”, “date handling”, “format field”, and see what turns up.

Then we’d open those files, look at the code, and we’d get a pretty good understanding of how WP Fusion handles time zones across the various CRMs.

This is also what “agents” like Claude Code or Cursor are doing when they start a task: they perform text searches for keywords, then they expand their context around anywhere those keywords show up in the codebase. They just do it way faster than a human could.

Step 1: Search for "timezone conversion"
        ↓
Step 2: Grep/text match - find exact words
        ↓
Step 3: Found in file: class-api.php
        Lines 145-167 contain "timezone"
        ↓
Step 4: Expand context around match
        Read lines 130-180 (±15 lines)

This works pretty well, but only for those exact keywords. Related code that doesn’t match literally, for example a function that handles “UTC offset”, would be completely overlooked.
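Here's a rough sketch of that keyword-then-expand flow, and of the blind spot. The function name, the `context` parameter, and the five-line "file" are all hypothetical and invented for this example; real agents use their own search tooling.

```python
def keyword_search(lines, keyword, context=15):
    """Find lines containing `keyword` and return (start, end, match) tuples.

    Line numbers are 1-indexed; start/end mark the context window to read.
    """
    hits = []
    for number, text in enumerate(lines, start=1):
        if keyword.lower() in text.lower():
            start = max(1, number - context)
            end = min(len(lines), number + context)
            hits.append((start, end, number))
    return hits


# Hypothetical file contents, invented for this example.
source = [
    "// Convert the stored timezone before syncing.",
    "$date = convert_timezone( $date );",
    "",
    "// Apply the UTC offset expected by the CRM.",
    "$date = apply_utc_offset( $date, $offset );",
]

# A small window (±1 line) keeps the example readable.
for start, end, match in keyword_search(source, "timezone", context=1):
    print(f"match on line {match}, read lines {start}-{end}")
```

With this toy input, the search for "timezone" never reaches lines 4 and 5, even though the UTC offset code there is clearly related.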

Embeddings to the rescue

For illustration, we’re using 12 “dimensions” (real embeddings have 384+ dimensions!). Here each number stands for one labeled semantic concept; in a real model the individual dimensions don’t come with human-readable labels like this.

Legend:

🔴 Red (High values): Strong presence of this concept
🟡 Yellow (Medium values): Moderate presence
⚪ White (Low values): Weak/absent concept

Search query

Our search phrase can be converted to a 12-dimensional embedding like so.

"timezone conversion date handling format field value"

Vector: [0.89, 0.82, 0.91, 0.67, 0.78, 0.45, 0.23, 0.71, 0.88, 0.34, 0.19, 0.52]

Dimensions:      [  1     2     3     4     5     6     7     8     9    10    11    12  ]
Concepts:        [time  date  format conv  field  type  class  value trans  api  code  doc ]
Values:          [🔴0.89 🔴0.82 🔴0.91 🟡0.67 🔴0.78 🟡0.45 ⚪0.23 🟡0.71 🔴0.88 ⚪0.34 ⚪0.19 🟡0.52]

Key concepts: Strong signals for time, date, format, field, value, and transformation.

Results in the code

Here are the relevant search results in the WP Fusion code, as vectors:

Result 1:

/**
 * Gets the default datetime format for syncing with the CRM.
 *
 * @since 3.37.27
 *
 * @return string The date time format.
 */
public function get_crm_datetime_format() {

Vector: [0.85, 0.90, 0.93, 0.51, 0.62, 0.48, 0.18, 0.55, 0.73, 0.41, 0.15, 0.68]

Dimensions:      [  1     2     3     4     5     6     7     8     9    10    11    12  ]
Concepts:        [time  date  format conv  field  type  class  value trans  api  code  doc ]
Search Query:    [🔴0.89 🔴0.82 🔴0.91 🟡0.67 🔴0.78 🟡0.45 ⚪0.23 🟡0.71 🔴0.88 ⚪0.34 ⚪0.19 🟡0.52]
Result 1:        [🔴0.85 🔴0.90 🔴0.93 🟡0.51 🟡0.62 🟡0.48 ⚪0.18 🟡0.55 🟡0.73 🟡0.41 ⚪0.15 🟡0.68]
Overlap:         [  ✓     ✓     ✓     ✓     ~     ✓     ✓     ~     ~     -     ✓     -  ]

Match: 87%

Why it matches: Check out dimensions 1, 2, and 3 (time, date, format): nearly identical values. The function name contains “datetime” and “format”, which is very close to what we’re looking for. Even though we searched for “timezone conversion”, the embedding places “datetime format” right next to it in meaning.

Result 2:

/**
 * Format field values to match HighLevel formats.
 *
 * @since 3.37.11
 *
 * @param string $value The value.
 * @param string $field_type The field type.
 * @param string $field The CRM field.
 */
public function format_field_value( $value, $field_type, $field ) {

Vector: [0.72, 0.68, 0.95, 0.44, 0.91, 0.76, 0.31, 0.89, 0.62, 0.38, 0.22, 0.58]

Dimensions:      [  1     2     3     4     5     6     7     8     9    10    11    12  ]
Concepts:        [time  date  format conv  field  type  class  value trans  api  code  doc ]
Search Query:    [🔴0.89 🔴0.82 🔴0.91 🟡0.67 🔴0.78 🟡0.45 ⚪0.23 🟡0.71 🔴0.88 ⚪0.34 ⚪0.19 🟡0.52]
Result 2:        [🟡0.72 🟡0.68 🔴0.95 🟡0.44 🔴0.91 🟡0.76 ⚪0.31 🔴0.89 🟡0.62 ⚪0.38 ⚪0.22 🟡0.58]
Overlap:         [  ~     ~     ✓     ~     ✓     -     ✓     ✓     ~     -     ✓     ✓  ]

Match: 84%

Why it matches: Look at dimensions 3, 5, and 8 (format, field, value) – these are really strong. This function is all about formatting field values, which is exactly what you’d need for timezone handling. Notice it scores high on “field” (0.91) and “value” (0.89), matching our search intent even though the words “timezone” and “conversion” don’t appear anywhere in the docblock.

Result 3:

/**
 * Format field value.
 *
 * Formats outgoing data to match CRM field formats.
 * This will vary depending on the data formats accepted by the CRM.
 *
 * @since 3.40.42
 *
 * @param mixed $value The value.
 */
public function format_field_value( $value ) {

Vector: [0.66, 0.71, 0.94, 0.58, 0.88, 0.69, 0.28, 0.87, 0.71, 0.36, 0.24, 0.63]

Dimensions:      [  1     2     3     4     5     6     7     8     9    10    11    12  ]
Concepts:        [time  date  format conv  field  type  class  value trans  api  code  doc ]
Search Query:    [🔴0.89 🔴0.82 🔴0.91 🟡0.67 🔴0.78 🟡0.45 ⚪0.23 🟡0.71 🔴0.88 ⚪0.34 ⚪0.19 🟡0.52]
Result 3:        [🟡0.66 🟡0.71 🔴0.94 🟡0.58 🔴0.88 🟡0.69 ⚪0.28 🔴0.87 🟡0.71 ⚪0.36 ⚪0.24 🟡0.63]
Overlap:         [  ~     ~     ✓     ~     ✓     -     ✓     ✓     ~     -     ✓     ✓  ]

Match: 83%

Why it matches: Nearly identical pattern to Result 2. The key phrase here is “data formats accepted by the CRM” – dates and timezones are always a formatting challenge when syncing to different CRMs. The embedding picked up on this context without needing the literal word “timezone”.

Result 4:

/**
 * Get ISO 8601 date.
 *
 * Takes an input string and returns an ISO 8601 compatible date for API calls,
 * for example 2024-04-24T00:18:01+08:00.
 *
 * @since 3.43.6
 *
 * @param int|string $timestamp The timestamp to convert (in GMT by default).
 */
function get_iso_8601_date( $timestamp ) {

Vector: [0.91, 0.88, 0.87, 0.79, 0.43, 0.39, 0.21, 0.51, 0.81, 0.52, 0.17, 0.71]

Dimensions:      [  1     2     3     4     5     6     7     8     9    10    11    12  ]
Concepts:        [time  date  format conv  field  type  class  value trans  api  code  doc ]
Search Query:    [🔴0.89 🔴0.82 🔴0.91 🟡0.67 🔴0.78 🟡0.45 ⚪0.23 🟡0.71 🔴0.88 ⚪0.34 ⚪0.19 🟡0.52]
Result 4:        [🔴0.91 🔴0.88 🔴0.87 🔴0.79 🟡0.43 ⚪0.39 ⚪0.21 🟡0.51 🔴0.81 🟡0.52 ⚪0.17 🟡0.71]
Overlap:         [  ✓     ✓     ✓     ✓     ~     ✓     ✓     ~     ✓     -     ✓     -  ]

Match: 79%

Why it matches: This is the one we were really after. Look at dimensions 1, 2, 3, and 9 (time, date, format, transformation): nearly perfect alignment. The docblock mentions “GMT by default” and shows a timezone offset example (+08:00). This is explicitly about timezone handling, and the embedding knows it. It only scores lowest overall because the “field” and “value” dimensions (5 and 8) are weaker here. Even though our search didn’t mention “ISO 8601” or “GMT”, the match is on meaning rather than on shared words.
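If you want to play with the numbers, here's a small sketch that ranks the four illustrative result vectors against the query vector. Using cosine similarity is my assumption; this post doesn't say how Codanna actually scores a match. Because the 12-number vectors and the match percentages above are hand-made for illustration, the scores printed here come out much higher than those percentages and need not reproduce the same ordering.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Illustrative vectors copied from the examples above.
query = [0.89, 0.82, 0.91, 0.67, 0.78, 0.45, 0.23, 0.71, 0.88, 0.34, 0.19, 0.52]
results = {
    "Result 1: get_crm_datetime_format": [0.85, 0.90, 0.93, 0.51, 0.62, 0.48, 0.18, 0.55, 0.73, 0.41, 0.15, 0.68],
    "Result 2: format_field_value (HighLevel)": [0.72, 0.68, 0.95, 0.44, 0.91, 0.76, 0.31, 0.89, 0.62, 0.38, 0.22, 0.58],
    "Result 3: format_field_value": [0.66, 0.71, 0.94, 0.58, 0.88, 0.69, 0.28, 0.87, 0.71, 0.36, 0.24, 0.63],
    "Result 4: get_iso_8601_date": [0.91, 0.88, 0.87, 0.79, 0.43, 0.39, 0.21, 0.51, 0.81, 0.52, 0.17, 0.71],
}

# Score every result once, then sort from most to least similar.
scores = {name: cosine_similarity(query, vector) for name, vector in results.items()}
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

for name, score in ranked:
    print(f"{score:.3f}  {name}")
```

The takeaway is the mechanism, not the exact numbers: every chunk of code gets a vector, the query gets a vector, and the results are whatever sits closest.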

What Traditional Search Would Have Found

Here’s the kicker. If you did a traditional grep search for “timezone conversion”:

❌ Result 1: No mention of “timezone”
❌ Result 2: No mention of “timezone”
❌ Result 3: No mention of “timezone”
✅ Result 4: Mentions “GMT” and a timezone offset, so a keyword list that includes “GMT” would catch it

In this example, traditional search would have missed three of the four relevant functions (75%). Vector search surfaced all four because it matches on meaning, not just on keywords.
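You can check the keyword side of this yourself with a few lines of code. The sketch below does a case-insensitive literal match against the docblock text quoted in this post. It only sees those docblocks, not the function bodies, so a grep over the real files could turn up more. The `literal_hits` helper is a hypothetical name for this example.

```python
# Docblock text copied from the four results above (tags and bodies omitted).
docblocks = {
    "Result 1": "Gets the default datetime format for syncing with the CRM.",
    "Result 2": "Format field values to match HighLevel formats.",
    "Result 3": (
        "Format field value. Formats outgoing data to match CRM field formats. "
        "This will vary depending on the data formats accepted by the CRM."
    ),
    "Result 4": (
        "Get ISO 8601 date. Takes an input string and returns an ISO 8601 "
        "compatible date for API calls, for example 2024-04-24T00:18:01+08:00. "
        "The timestamp to convert (in GMT by default)."
    ),
}


def literal_hits(term):
    """Return the results whose docblock contains `term`, ignoring case."""
    return [name for name, text in docblocks.items() if term.lower() in text.lower()]


for term in ("timezone", "conversion", "GMT", "format"):
    print(f"{term!r}: {literal_hits(term)}")
```

The literal word "timezone" doesn't appear in any of the quoted docblocks, and Result 4 only turns up if you think to search for "GMT". That's the whole problem with keyword search: you have to guess the author's vocabulary.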

Codanna

Codanna “vectorizes” your code.

WP Fusion is a big plugin. With 100+ possible plugin integrations, it has about 200,000 lines of code. Codanna has vectorized that into a database file of about 20MB (which is a lot of text 🤯).

We then use a Cursor rule to tell the AI when it should use a vector search over a filesystem-based search.

You can grab our codanna.mdc rule here.

Practical example

WP Fusion’s MemberPress integration is quite complex when it comes to recurring payments. As an example, I asked a pretty broad question:

How do we handle MemberPress subscription cancellations?

You can see in the screenshot that the agent first recalls memories (more on that in a bit), then runs the semantic_search_with_context and find_symbol tools rather than searching the filesystem.

We can expand the “thinking” and the tool calls in the chat to peek under the hood:

Screenshot

The user is asking about how WP Fusion handles MemberPress subscription cancellations. This is a strategic/architectural question about integration behavior.

According to the memory rules, I should:

ALWAYS recall for integration or API work (check past implementations)

This is about “How does X work?” which is a strategic question that benefits from memory context

The user is asking about a specific integration (MemberPress) behavior

I should also use Codanna according to the rules:

This is a “How does X work?” question → Use mcp_codanna_semantic_search_with_context

I should run memory recall and Codanna in parallel for comprehensive context

The semantic search tool then surfaces not only the MemberPress subscriptions functionality, but also the logic around expired MemberPress transactions (which is related to subscription cancellation, but a separate process).

It also finds the code that handles propagation effects to sub-account members in the MemberPress Corporate Accounts addon, and for comparison: the subscription cancellation logic in WooCommerce Subscriptions and Easy Digital Downloads.

The full response:

MemberPress Subscription Cancellations

WP Fusion handles cancellations through two parallel systems:

1. Subscription Status Changes

Hook: mepr_subscription_transition_status
Method: subscription_status_changed() in class-memberpress-subscriptions.php:114

When status → cancelled (line 219):

$apply_tags = array_merge( $apply_tags, $settings['apply_tags_cancelled'] );

If subscription is expired (line 208):

$remove_tags = array_merge( $remove_tags, $settings['tag_link'] );
$apply_tags = array_merge( $apply_tags, $settings['apply_tags_expired'] );
if ( ! empty( $settings['remove_tags'] ) ) {
     $remove_tags = array_merge( $remove_tags, $settings['apply_tags_registration'] );
}

2. Transaction Expiration

Hook: mepr-transaction-expired
Method: transaction_expired() in class-memberpress-transactions.php:544

Removes linked tags, optionally removes registration tags, applies expired tags.

Smart Protections

Concurrent subscriptions (line 242-286): Won’t remove tags if user has another active subscription configured with those same tags
Same product check (line 560): Won’t process if user still has active subscription to same product
Loop prevention (line 309, 574): Temporarily removes wpf_tags_modified hook to prevent re-triggering membership additions

Key Insight from Memory

Based on the stored pattern: MemberPress behaves differently for grouped vs non-grouped memberships. Grouped memberships with upgrade paths create fallback transactions that trigger mepr-txn-status-complete naturally. Non-grouped memberships don’t create fallback transactions—they just update the mepr_members table, requiring manual field syncing in transaction_expired().

Without Codanna

Without Codanna, the agent correctly identifies the file class-memberpress-subscriptions.php in the MemberPress folder and explains how the subscription cancellation functionality works within that class. But it doesn’t look at the transaction cancellation and expiration functionality, or at the subscription cancellation logic that cascades into MemberPress add-ons like Corporate Accounts.

Setting It Up (Fair Warning)

You do need a little bit of technical knowledge to set this up. You need to:

Clone the repo and install dependencies
Index your codebase (took about 20 minutes for WP Fusion)
Configure it with your AI tool of choice
Probably curse a few times when it doesn’t work the first time

But once it’s running? 😘👌

Is This Worth Your Time?

Setting up Codanna took me about 20 minutes, and I’m pretty technical. The documentation is minimal, and you’ll need to be comfortable with npm and command line tools. If you’ve never indexed a codebase before, budget some extra time for troubleshooting.

It proved its worth that same day when debugging a similarly thorny issue involving MemberPress free transactions and automatic downgrades. With Codanna, the Cursor agent was able to clearly explain what status changes and tag updates we should expect in very specific user scenarios.

For a codebase like WP Fusion (200,000 lines across 400+ files), this makes a significant difference. Each CRM has its own date format requirements, subscription models, and edge cases. Traditional search requires you to already know the right keywords. Vector search helps when you only know the problem.

P.S. – I’m also using AutoMem alongside Codanna to remember past debugging sessions. It’s like giving your AI long-term memory about all the weird decisions and fixes you’ve made. Whole other rabbit hole though… 🐰

P.P.S. – Speaking of AutoMem: if Codanna gives you spatial awareness (where is X in the codebase?), AutoMem gives you temporal awareness (what did we decide last week? why did we do it that way?). It’s a persistent memory layer for AI assistants that works across Claude, Cursor, and ChatGPT (basically anything that supports MCP). Powered by FalkorDB (graph database) + Qdrant (vector search), it stores memories with relationships and retrieves them semantically. We built it because we got tired of re-explaining the same architectural decisions every conversation. Check out the main project or the MCP server if you’re interested. One-command setup. Works great with Codanna. 🧠