Content Embeddings / Smart Search
Flow ID: IN-16 | Module(s): src/ContentEmbeddings/ | Complexity: Medium
Business Overview
Content Embeddings is an internal Ecommercen feature that enables rich text content (blog posts, page builder blocks, CMS pages) to embed live product displays and product list sliders directly within their HTML content. Editors insert special reference tokens into rich text fields (via TinyMCE), and the system extracts, resolves, and renders those references into interactive product components at render time.
This is not AI-based search or vector embeddings -- it is a regex-based content reference extraction and hydration system that replaces shortcodes with rendered product HTML.
Reference Token Format
[[ref_product:42]] -> Single product display
[[ref_product:42,55,88]] -> Multiple products (slider/grid)
[[ref_product_list:7]] -> Product list (dynamic collection)
[[ref_product_list:7,12]] -> Multiple product lists
Pattern: [[ref_{type}:{id(s)}]] where IDs are comma-separated integers.
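
To make the token format concrete, here is a minimal sketch that pulls tokens out of an HTML string. It reuses the regular expression this page lists for the extraction phase, but runs it through PCRE (`preg_match_all`) rather than the `mb_ereg_search_*` functions the extractor is described as using. The function name `extractTokens` and the returned array shape are assumptions made for illustration, not the real `ContentEmbeddingsExtractor` API.

```php
<?php
// Sketch only: a PCRE version of the documented token pattern.
// The inner ID group is non-capturing so only type and ID list are captured.
const TOKEN_PATTERN = '/\[\[ref_([a-zA-Z][a-zA-Z_-]+[a-zA-Z]):(\d+(?:,\d+)*)\]\]/';

/**
 * Return every token found in $content as ['type' => string, 'ids' => int[]].
 * Hypothetical helper; no whitelist check is applied here.
 */
function extractTokens(string $content): array
{
    $tokens = [];
    preg_match_all(TOKEN_PATTERN, $content, $matches, PREG_SET_ORDER);

    foreach ($matches as $match) {
        $tokens[] = [
            'type' => $match[1],                                    // e.g. "product_list"
            'ids'  => array_map('intval', explode(',', $match[2])), // "42,55" -> [42, 55]
        ];
    }

    return $tokens;
}

$html = '<p>Our picks: [[ref_product:42,55,88]]</p><p>[[ref_product_list:7]]</p>';
print_r(extractTokens($html));
```

The sketch keeps tokens in document order and leaves type validation to a later step, which matches the page's description of validation happening when an embedding is added to the value object.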
API Reference
Extraction API
| Class | Method | Parameters | Returns |
|---|---|---|---|
| ContentEmbeddingsExtractor | extract($content) | string (HTML content) | ContentEmbeddings object |
| ContentEmbeddingsExtractor | getPattern($type, $value) | ?string, mixed | Regex pattern string |
ContentEmbeddings Value Object
| Method | Parameters | Returns | Description |
|---|---|---|---|
| addEmbedding($type, $idOrIds) | string, string | void | Adds parsed embedding (type-validated) |
| getEmbeddingsOfType($type, $unique) | string, bool | array | Gets embeddings by type |
| isEmpty() | -- | bool | True if no embeddings found |
| isEmbeddingTypeAvailable($type) | string | bool | Validates embedding type |
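
The following is a rough stand-in for the value object, written only to show how the four methods in the table could fit together. The class name `EmbeddingBag` is hypothetical, and the meaning of the `$unique` flag is an assumption (here it returns a flat, de-duplicated ID list); the page does not spell out what the real `ContentEmbeddings` class does with it.

```php
<?php
// Hypothetical stand-in for the ContentEmbeddings value object; not the real class.
final class EmbeddingBag
{
    private const TYPES = ['product', 'product_list'];

    /** @var array<string, array<int, int[]>> ID sets grouped by type */
    private array $embeddings = [];

    public function isEmbeddingTypeAvailable(string $type): bool
    {
        return in_array($type, self::TYPES, true);
    }

    public function addEmbedding(string $type, string $idOrIds): void
    {
        if (!$this->isEmbeddingTypeAvailable($type)) {
            return; // unknown types are silently ignored
        }
        // "42,55" becomes [42, 55]; each token keeps its own ID set.
        $this->embeddings[$type][] = array_map('intval', explode(',', $idOrIds));
    }

    public function getEmbeddingsOfType(string $type, bool $unique = false): array
    {
        $sets = $this->embeddings[$type] ?? [];
        if (!$unique) {
            return $sets;
        }
        // Assumption: "unique" means a flat list of distinct IDs.
        $flat = $sets === [] ? [] : array_merge(...$sets);
        return array_values(array_unique($flat));
    }

    public function isEmpty(): bool
    {
        return $this->embeddings === [];
    }
}

$bag = new EmbeddingBag();
$bag->addEmbedding('product', '42,55');
$bag->addEmbedding('product', '55');
$bag->addEmbedding('banner', '5'); // not whitelisted, so nothing is stored
```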
Hydration API
| Class | Method | Parameters | Returns |
|---|---|---|---|
| ContentEmbeddingsHydrator | hydrate($content, $options, $extra) | string, array, array | Hydrated HTML string |
| ProductHydrator | hydrate($content, $embeddings, $options, $extra) | string, ContentEmbeddings, array, array | Content with products rendered |
| ProductListHydrator | hydrate($content, $embeddings, $options, $extra) | string, ContentEmbeddings, array, array | Content with product lists rendered |
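
As an illustration of what a single hydrator pass does, the sketch below replaces product tokens with rendered markup and strips tokens that cannot be resolved. It is not the real `ProductHydrator`: the function name, the `$render` callable (standing in for the view template) and the handling of partially resolvable ID sets (render what was found) are all assumptions.

```php
<?php
/**
 * Sketch of one hydrator pass for the "product" type.
 * $productsById plays the role of $options['product'];
 * $render is a hypothetical stand-in for the view template.
 */
function hydrateProductTokens(string $content, array $productsById, callable $render): string
{
    $pattern = '/\[\[ref_product:(\d+(?:,\d+)*)\]\]/';

    $result = preg_replace_callback(
        $pattern,
        function (array $match) use ($productsById, $render): string {
            $found = [];
            foreach (explode(',', $match[1]) as $id) {
                if (isset($productsById[(int) $id])) {
                    $found[] = $productsById[(int) $id]; // keep token order
                }
            }
            // Nothing resolvable: strip the token instead of showing it raw.
            return $found === [] ? '' : $render($found);
        },
        $content
    );

    return $result ?? $content; // preg_replace_callback returns null on error
}

// Minimal renderer used only for this example.
$render = fn(array $products): string => '<ul>' . implode('', array_map(
    fn(array $p): string => '<li>' . htmlspecialchars($p['name']) . '</li>',
    $products
)) . '</ul>';

$content = '<p>[[ref_product:42,99]]</p><p>[[ref_product:7]]</p>[[ref_product_list:3]]';
$hydrated = hydrateProductTokens($content, [42 => ['name' => 'Mug']], $render);
```

Because the pattern requires a colon directly after `ref_product`, the product list token is left untouched for the next hydrator.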
Supported Embedding Types
| Constant | Value | Description |
|---|---|---|
| EMBEDDING_TYPE_PRODUCT | product | Individual product reference(s) |
| EMBEDDING_TYPE_PRODUCT_LIST | product_list | Product list/collection reference(s) |
Code Flow
Extraction Phase
ContentEmbeddingsExtractor::extract($content)
  -> mb_ereg_search_init($content, regex_pattern)
     -> Pattern: \[\[ref_([a-zA-Z][a-zA-Z_-]+[a-zA-Z]):(\d+(,\d+)*)\]\]
  -> For each match:
     -> $type = match[1] (e.g., "product", "product_list")
     -> $ids = match[2] (e.g., "42" or "42,55,88")
     -> ContentEmbeddings::addEmbedding($type, $ids)
        -> Validates type against whitelist (product, product_list)
        -> Splits IDs by comma, converts to int array
        -> Groups by type: embeddings[type][] = [id, id, ...]
  -> Returns ContentEmbeddings object
Hydration Phase
ContentEmbeddingsHydrator::hydrate($content, $options, $extraData)
  -> Extract embeddings from content
  -> If empty, return content unchanged
  -> For each registered hydrator type:
     -> ProductHydrator or ProductListHydrator
     -> hydrator->hydrate($content, $embeddings, $options, $extraData)
        -> Extract type-specific embeddings from content
        -> For each embedding set:
           -> Look up cached data from $options[type]
           -> Build HydratedContentData (bidirectional state container)
           -> Render view template via CI loader
           -> Replace regex pattern in content with rendered HTML
        -> Strip any unresolved embeddings of this type
  -> Return modified content
Product Hydration Detail
ProductHydrator::hydrate(...)
  -> extractEmbeddings($content) (re-extracts from current content state)
  -> For each product embedding set (e.g., [42, 55]):
     -> Look up products in $options['product'] (pre-cached indexed data)
     -> Create HydratedContentData with:
        - productDataSet, liveData, productCodes, productCodeImages
        - productIds, allPageProducts reference
     -> Render view: {template}/components/content_embeddings/product
     -> Merge rendered product data back into allPageProducts
     -> Replace [[ref_product:42,55]] with rendered HTML
  -> Strip any remaining unresolved product embeddings
Product List Hydration Detail
ProductListHydrator::hydrate(...)
  -> extractEmbeddings($content)
  -> For each product list embedding set (e.g., [7]):
     -> Look up product list data in $options['product_list']
     -> Extract product IDs from each list's products
     -> Create HydratedContentData with:
        - productListDataSet, productIds by list
        - liveData, productCodes, productCodeImages, allPageProducts
     -> Render view: {template}/components/content_embeddings/product_list
     -> Replace [[ref_product_list:7]] with rendered HTML
  -> Strip remaining unresolved product_list embeddings
Architecture
src/ContentEmbeddings/
  Extraction/
    ContentEmbeddingsExtractor.php      # Regex-based token extraction
    ContentEmbeddings.php               # Value object holding parsed embeddings
  Hydration/
    ContentEmbeddingsHydrator.php       # Orchestrator: extraction + hydrator dispatch
    AbstractHydrator.php                # Base class: pattern matching, replacement, cleanup
    HydratorInterface.php               # Contract for hydrators
    ProductHydrator.php                 # Renders product embeddings
    ProductListHydrator.php             # Renders product list embeddings
    HydratorData.php                    # Shared state container (liveData, productCodes, template, CI instance)
    HydratedContentData.php             # Bidirectional state for view rendering
application/views/main/components/content_embeddings/
  product.php                           # Product embedding view template
  product_list.php                      # Product list embedding view template
assets/main/scss/components/
  _content_embeddings.scss              # Styling for embedded components
tests/Unit/ContentEmbeddings/Extraction/
  ContentEmbeddingsExtractorTest.php    # Extraction unit tests
  ContentEmbeddingsTest.php             # Value object unit tests
Class Hierarchy
HydratorInterface
  <- AbstractHydrator (base: pattern matching, replacement, cleanup)
     <- ProductHydrator (template: content_embeddings/product)
     <- ProductListHydrator (template: content_embeddings/product_list)
Key Design Decisions
- Two-phase architecture: Extraction and hydration are separated. Extraction is pure regex (no DB calls), while hydration requires product data lookups and template rendering.
- Pre-cached data: Product and product list data is passed via the $options parameter, meaning the calling code (typically the front controller) pre-fetches all needed data in a single batch query.
- Bidirectional state: HydratedContentData allows views to modify state (especially allPageProducts) that flows back to the hydrator, enabling product tracking across multiple embedded components on the same page.
- Template-aware: Views are resolved via the active template's component directory, supporting per-theme rendering.
- Graceful degradation: Unresolved embeddings (referencing deleted products or invalid IDs) are stripped from the content rather than displayed as raw tokens.
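
The bidirectional state idea can be sketched as follows. A state object is handed to the view, the view records which products it displayed, and the caller reads the updated list back before rendering the next embed. `SketchRenderState` and `renderEmbed` are hypothetical names; the real `HydratedContentData` carries more fields than shown here.

```php
<?php
// Hypothetical container loosely modelled on HydratedContentData.
final class SketchRenderState
{
    /** @var int[] product IDs the current embed should show */
    public array $productIds;

    /** @var int[] product IDs rendered so far on the whole page */
    public array $allPageProducts;

    public function __construct(array $productIds, array $allPageProducts)
    {
        $this->productIds = $productIds;
        $this->allPageProducts = $allPageProducts;
    }
}

// Stand-in for a view template: renders markup and reports back what it showed.
function renderEmbed(SketchRenderState $state): string
{
    foreach ($state->productIds as $id) {
        if (!in_array($id, $state->allPageProducts, true)) {
            $state->allPageProducts[] = $id;
        }
    }
    return '<div data-products="' . implode(',', $state->productIds) . '"></div>';
}

$allPageProducts = [];
$html = '';
foreach ([[42, 55], [55, 7]] as $ids) {
    $state = new SketchRenderState($ids, $allPageProducts);
    $html .= renderEmbed($state);
    // The view's changes flow back to the caller for the next embed.
    $allPageProducts = $state->allPageProducts;
}
```

Objects are passed by handle in PHP, so the view can mutate the state without returning it explicitly; the caller only has to copy the accumulated list forward.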
Data Model
No Dedicated Tables
Content embeddings do not have their own database tables. The reference tokens live inline within existing content fields (e.g., blog_mui.description, builder block content). Product and product list data is loaded from the standard shop_product and product list tables.
Integration Points
| Content Source | Where Tokens Appear |
|---|---|
| Blog posts | blog_mui.description (rich text) |
| Page builder blocks | Block content fields |
| CMS pages | Rich text content areas |
Configuration
TinyMCE Integration
The TinyMCE rich text editor (application/views/admin/utils/tinyMCE.php) includes a custom button/plugin for inserting content embedding tokens. Administrators select products or product lists from a picker, and the editor inserts the [[ref_product:...]] or [[ref_product_list:...]] token.
Template Views
Each storefront template provides its own component views:
- {template}/components/content_embeddings/product.php
- {template}/components/content_embeddings/product_list.php
PurgeCSS
The _content_embeddings.scss styles are included in the main PurgeCSS profile (build/purgecss_profiles/main/_commons.js) to ensure styles are not stripped during production builds.
Client Extension Points
Custom Embedding Types
The system currently supports two types (product, product_list). To add a custom type:
- Add the type constant to ContentEmbeddings::isEmbeddingTypeAvailable()
- Create a hydrator class extending AbstractHydrator
- Register the hydrator in ContentEmbeddingsHydrator::setupHydrators()
- Create the view template in the storefront template directory
In a client repo, this would require overriding ContentEmbeddingsHydrator via DI or creating a Custom\ContentEmbeddings\ namespace.
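
The shape of such an extension might look like the sketch below, which adds a `banner` type. Everything here is hypothetical: the real `AbstractHydrator` contract is not shown on this page, so `SketchHydrator`, its `type()` and `render()` hooks, and the plain array used for registration are illustrative assumptions rather than the actual extension API.

```php
<?php
// All names below are hypothetical; this is not the real AbstractHydrator.
abstract class SketchHydrator
{
    abstract protected function type(): string;

    abstract protected function render(array $items): string;

    public function hydrate(string $content, array $options): string
    {
        $pattern = '/\[\[ref_' . preg_quote($this->type(), '/') . ':(\d+(?:,\d+)*)\]\]/';
        $data = $options[$this->type()] ?? [];

        $result = preg_replace_callback($pattern, function (array $m) use ($data): string {
            $items = [];
            foreach (explode(',', $m[1]) as $id) {
                if (isset($data[(int) $id])) {
                    $items[] = $data[(int) $id];
                }
            }
            // Unresolved tokens of this type are stripped.
            return $items === [] ? '' : $this->render($items);
        }, $content);

        return $result ?? $content;
    }
}

final class BannerHydrator extends SketchHydrator
{
    protected function type(): string
    {
        return 'banner';
    }

    protected function render(array $items): string
    {
        return implode('', array_map(
            fn(array $b): string => '<img src="' . htmlspecialchars($b['src']) . '">',
            $items
        ));
    }
}

// Registration order doubles as processing order in this sketch.
$hydrators = [new BannerHydrator()];
$content = 'Intro [[ref_banner:5]] outro [[ref_banner:9]]';
foreach ($hydrators as $hydrator) {
    $content = $hydrator->hydrate($content, ['banner' => [5 => ['src' => '/img/sale.png']]]);
}
```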
Custom Templates
Override the product/product list view templates in the client's storefront template to customize the rendering of embedded products (layout, styling, additional data).
Custom Data Resolution
The calling code controls what data is passed in $options. Client repos can modify the front controller to include additional pre-fetched data (e.g., inventory status, promotional flags) that views can then access.
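
One way a caller could assemble `$options` is to collect every referenced ID first and then fetch each type once. The sketch below assumes that approach; `collectReferencedIds`, `buildOptions` and the `$fetchByType` callable are hypothetical helpers, and the fake fetcher only stands in for whatever query the front controller actually runs.

```php
<?php
/**
 * Collect the distinct IDs referenced per type so the caller can fetch
 * each type with a single query. Hypothetical helper.
 */
function collectReferencedIds(string $content): array
{
    $idsByType = [];
    preg_match_all(
        '/\[\[ref_(product|product_list):(\d+(?:,\d+)*)\]\]/',
        $content,
        $matches,
        PREG_SET_ORDER
    );

    foreach ($matches as $m) {
        foreach (explode(',', $m[2]) as $id) {
            $idsByType[$m[1]][(int) $id] = true; // array keys de-duplicate IDs
        }
    }

    return array_map('array_keys', $idsByType);
}

/** Build the $options array with one fetch call per embedding type. */
function buildOptions(string $content, callable $fetchByType): array
{
    $options = [];
    foreach (collectReferencedIds($content) as $type => $ids) {
        $options[$type] = $fetchByType($type, $ids);
    }
    return $options;
}

$calls = 0;
$fetch = function (string $type, array $ids) use (&$calls): array {
    $calls++;
    // A real implementation would run one "WHERE id IN (...)" query here
    // and could attach extra fields such as inventory status.
    return array_fill_keys($ids, ['type' => $type]);
};

$options = buildOptions(
    '[[ref_product:42,55]] [[ref_product:55]] [[ref_product_list:7]]',
    $fetch
);
```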
Business Rules
- Type whitelist: Only product and product_list types are accepted. Unknown types (e.g., [[ref_banner:5]]) are silently ignored during extraction.
- ID validation: IDs must be positive integers. Non-numeric values in the ID position will not match the regex pattern.
- Multiple IDs: A single token can reference multiple items (comma-separated). For products, this typically renders a slider or grid. For product lists, multiple lists are rendered sequentially.
- Graceful cleanup: After hydration, any remaining unresolved tokens of processed types are stripped from the output. This handles cases where referenced products have been deleted or deactivated.
- Page product tracking: All embedded products are tracked in allPageProducts for analytics purposes (e.g., product impression tracking). This accumulates across all embedding instances on a page.
- Rendering order: Product embeddings are processed before product list embeddings (based on hydrator registration order in setupHydrators()).
- Template resolution: View paths are resolved dynamically based on the active storefront template, allowing different visual presentations per theme.
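
The whitelist and ID rules can be checked against sample tokens with a small helper. This is a sketch under assumptions: `classifyToken` is a hypothetical function, the pattern is the PCRE equivalent of the documented one anchored to a whole token, and the three labels are invented for the example.

```php
<?php
// Hypothetical helper that mirrors the type whitelist and pattern rules.
function classifyToken(string $token): string
{
    $pattern = '/^\[\[ref_([a-zA-Z][a-zA-Z_-]+[a-zA-Z]):(\d+(?:,\d+)*)\]\]$/';

    if (preg_match($pattern, $token, $m) !== 1) {
        return 'no_match'; // the regex never sees this as a token
    }

    return in_array($m[1], ['product', 'product_list'], true)
        ? 'accepted'
        : 'ignored_type'; // matches the pattern but is not whitelisted
}

$samples = [
    '[[ref_product:42]]',
    '[[ref_product_list:7,12]]',
    '[[ref_banner:5]]',
    '[[ref_product:abc]]',
    '[[ref_product:42, 55]]',
];

foreach ($samples as $sample) {
    echo $sample, ' => ', classifyToken($sample), PHP_EOL;
}
```

Note that a space after the comma is enough to stop a token from matching, which is worth keeping in mind when tokens are typed by hand rather than inserted through the editor picker.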
Related Flows
- CF-26 Home Page -- Page builder blocks that may contain content embeddings
- CF-02 Product Detail -- Product data resolution for embedded references
- IN-01 Feed Generation -- Product data model shared with feed system