Solr Search Indexing
Flow ID: SY-06 | Module(s): job, search | Complexity: High | Last Updated: 2026-04-04
Business Overview
The platform uses Apache Solr for full-text product (and blog) search with Greek language support. The AdvSolrIndex job rebuilds the Solr index on a daily schedule, pushing all active products (and blogs in v2) as structured documents. The system supports two schema versions:
- v1: Product-only indexing with Greek-to-Latin transliteration fields for cross-script search.
- v2: Products plus blog articles, ICU tokenizer, custom Greek-Latin filter plugin (`GreekLatinTokenFilterFactory`), n-gram suggestion fields, and a document modification stamp for safe atomic updates.
Schema creation is handled separately via a CLI command (php cli.php solr/createSolrSchema).
Architecture
```
AdvSolrIndex (job, daily)
|
+--> Check solrSearch config enabled
+--> Dispatch to version handler:
     |
     +--> v1: solr_model::indexData()
     |    +--> productsToIndex()           SQL join query
     |    +--> formatData()                document transform
     |    +--> solr_client::delete('*:*')  wipe entire core
     |    +--> solr_client::update()       push all documents
     |
     +--> v2: solr_model_v2::indexData()
          +--> productsToIndex()           product SQL query
          +--> formatProductData()         product document transform
          +--> blogsToIndex()              blog SQL query
          +--> formatBlogData()            blog document transform
          +--> assignModificationStamp()   tag all docs with hash
          +--> solr_client::update()       push all documents (upsert)
          +--> solr_client::delete()       remove stale docs by stamp
```
Key Files
| File | Role |
|---|---|
| `ecommercen/job/libraries/AdvSolrIndex.php` | Job implementation |
| `application/modules/job/libraries/SolrIndex.php` | Client-overridable subclass |
| `ecommercen/search/models/Adv_solr_model.php` | v1 indexing model |
| `ecommercen/search/models/Adv_solr_model_v2.php` | v2 indexing model (products + blogs) |
| `ecommercen/libraries/AdvSolrClient.php` | HTTP client for Solr REST API |
| `ecommercen/search/controllers/Adv_solr.php` | Schema creation CLI controller |
| `ecommercen/search/traits/WithSolrDocumentTypeTrait.php` | Document ID generation/extraction |
| `ecommercen/libraries/SolrDocType.php` | Document type enum (Product, Blog) |
| `application/config/app.php` | Solr connection configuration |
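The ID convention handled by `WithSolrDocumentTypeTrait` and `SolrDocType` (type-prefixed IDs such as `product:{id}` and `blog:{id}`) could look roughly like the sketch below. The enum backing values, the `extractID()` name, and its return shape are assumptions made for illustration; only `generateID(SolrDocType::Product, id)` appears in this document. It requires PHP 8.1 or later for enums.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for ecommercen/libraries/SolrDocType.php.
enum SolrDocType: string
{
    case Product = 'product';
    case Blog = 'blog';
}

// Build a type-prefixed document ID such as "product:42".
function generateID(SolrDocType $type, int $id): string
{
    return $type->value . ':' . $id;
}

// Split a document ID back into its type and numeric ID (assumed helper name).
// Returns null when the ID does not follow the "{type}:{id}" convention.
function extractID(string $documentId): ?array
{
    $parts = explode(':', $documentId, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[1])) {
        return null;
    }
    $type = SolrDocType::tryFrom($parts[0]);

    return $type === null ? null : ['type' => $type, 'id' => (int) $parts[1]];
}
```

Prefixing the ID with the document type lets products and blogs share one core without ID collisions, since `product:7` and `blog:7` are distinct keys.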
Code Flow
v1 Indexing (Adv_solr_model::indexData)
1. Collect products via `productsToIndex()`:
   - Joins `shop_product` with MUI tables, vendor MUI, barcodes, product codes, and category MUI.
   - Filters: `soft_delete = false`, `active = true`, `price > 0`.
   - Aggregates multi-language names, descriptions, barcodes, product codes, and category slugs using `GROUP_CONCAT` with custom separators.
   - Enriches with sales count from `products_in_cart_model->getProductsWithNumberOfSales()`.
2. Format documents via `formatData()`:
   - Each product becomes a Solr document with fields: `id`, `soft_delete`, `active`, `price`, `vendor_id`, `vendor_name`, `product_hits`, `product_sales`, `barcode[]`, `product_code[]`, `product_category[]`, `product_name[]`, `product_description[]`.
   - Multi-valued fields are exploded from concatenated strings.
3. Full re-index:
   - `solr_client->delete('*:*')`: wipe all existing documents.
   - `solr_client->update(formattedProducts)`: push all documents with `softCommit=true`.
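The document transform in step 2 can be sketched as follows. This is a minimal illustration, not the code of `formatData()`: the separator value, the SQL column names (`barcodes`, `names`, and so on), and the helper names `explodeMultiValued()` and `formatRow()` are all assumptions, since the real query and separators are not shown here.

```php
<?php
declare(strict_types=1);

// Hypothetical separator; the real GROUP_CONCAT separators are not shown in this document.
const FIELD_SEPARATOR = '|||';

// Split a GROUP_CONCAT string into a list of non-empty values.
function explodeMultiValued(?string $concatenated): array
{
    if ($concatenated === null || $concatenated === '') {
        return [];
    }
    $values = array_map('trim', explode(FIELD_SEPARATOR, $concatenated));

    return array_values(array_filter($values, static fn (string $v): bool => $v !== ''));
}

// Turn one SQL result row (assumed column names) into a v1-style Solr document.
function formatRow(array $row, array $salesByProductId): array
{
    $id = (int) $row['id'];

    return [
        'id' => (string) $id,
        'soft_delete' => (bool) $row['soft_delete'],
        'active' => (bool) $row['active'],
        'price' => (float) $row['price'],
        'vendor_id' => (string) $row['vendor_id'],
        'vendor_name' => (string) $row['vendor_name'],
        'product_hits' => (int) $row['hits'],
        // Products without recorded sales default to zero.
        'product_sales' => $salesByProductId[$id] ?? 0,
        'barcode' => explodeMultiValued($row['barcodes']),
        'product_code' => explodeMultiValued($row['product_codes']),
        'product_category' => explodeMultiValued($row['category_slugs']),
        'product_name' => explodeMultiValued($row['names']),
        'product_description' => explodeMultiValued($row['descriptions']),
    ];
}
```

Empty and `NULL` aggregates become empty arrays, so a product with no barcodes sends no value for that field rather than an empty string.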
v2 Indexing (Adv_solr_model_v2::indexData)
1. Generate modification stamp: `md5(time())`, a hash of the current Unix timestamp that identifies this indexing run (unique as long as two runs do not start within the same second).
2. Collect and format products (same SQL as v1, plus `category_slugs` and the `product_category_slug` field):
   - Document IDs are prefixed with type: `product:{id}` via `generateID(SolrDocType::Product, id)`.
   - Includes `doc_type: 'product'` field.
3. Collect and format blogs via `blogsToIndex()`:
   - Joins `blog` with `blog_mui` for multi-language titles and descriptions.
   - Document IDs: `blog:{id}`.
   - Fields: `blog_title[]`, `blog_small_description[]`, `blog_description[]`, `blog_date`, `blog_hits`.
4. Stamp all documents with `doc_modification_stamp`.
5. Atomic update:
   - `solr_client->update(products + blogs)`: upsert all documents.
   - `solr_client->delete("-doc_modification_stamp:{stamp}")`: delete any document NOT stamped with the current run (removes stale entries without a full wipe).
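The stamp-then-delete sequence above can be illustrated with an in-memory stand-in for the Solr client. `InMemorySolrClient` and `reindex()` are hypothetical names used only for this sketch, and the fake `delete()` understands nothing except the negated stamp query; the real `AdvSolrClient` talks to Solr over HTTP.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for the Solr client, for illustration only.
final class InMemorySolrClient
{
    /** @var array<string, array> documents keyed by their ID */
    public array $docs = [];

    // Upsert: a document whose ID already exists replaces the old version.
    public function update(array $documents): void
    {
        foreach ($documents as $doc) {
            $this->docs[$doc['id']] = $doc;
        }
    }

    // Supports only the negated stamp query used by the v2 flow.
    public function delete(string $query): void
    {
        if (!preg_match('/^-doc_modification_stamp:([0-9a-f]{32})$/', $query, $m)) {
            throw new InvalidArgumentException("Unsupported query: {$query}");
        }
        // Keep only documents carrying the current stamp.
        $this->docs = array_filter(
            $this->docs,
            static fn (array $doc): bool => ($doc['doc_modification_stamp'] ?? null) === $m[1]
        );
    }
}

// Tag every document with the stamp of the current run.
function assignModificationStamp(array $documents, string $stamp): array
{
    return array_map(
        static fn (array $doc): array => array_merge($doc, ['doc_modification_stamp' => $stamp]),
        $documents
    );
}

// One indexing run: upsert everything first, then drop whatever was not touched.
function reindex(InMemorySolrClient $client, array $documents, string $stamp): void
{
    $client->update(assignModificationStamp($documents, $stamp));
    $client->delete('-doc_modification_stamp:' . $stamp);
}
```

Because the delete runs only after the upsert, documents from the previous run stay searchable until their replacements are in place, which is what distinguishes this from the v1 wipe-and-reload.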
Schema Creation (CLI)
The schema is created/updated via a separate CLI command, not during indexing:
```bash
php cli.php solr/createSolrSchema
```
This calls `solr_model->createSchema()` (v1) or `solr_model_v2->createSchema()` (v2), which POSTs the schema definition to Solr's Schema API.
The controller at `ecommercen/search/controllers/Adv_solr.php` is restricted to CLI invocations or advisable users (`is_cli() || isAdvisableUser()`).
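For orientation, a Schema API request body is a JSON object of commands such as `add-field` and `add-copy-field`. The sketch below builds such a body from a few of the v2 fields listed in this document. `buildSchemaPayload()` and its input format are hypothetical; the document does not show how `createSchema()` actually assembles its payload or which field properties it sets.

```php
<?php
declare(strict_types=1);

// Build a Schema API request body from simple field and copy-field lists.
// $fields:     field name => [Solr type, multiValued]
// $copyFields: source field => list of destination fields
function buildSchemaPayload(array $fields, array $copyFields): string
{
    $payload = ['add-field' => [], 'add-copy-field' => []];

    foreach ($fields as $name => [$type, $multiValued]) {
        $payload['add-field'][] = [
            'name' => $name,
            'type' => $type,
            'multiValued' => $multiValued,
        ];
    }
    foreach ($copyFields as $source => $destinations) {
        $payload['add-copy-field'][] = ['source' => $source, 'dest' => $destinations];
    }

    return json_encode($payload, JSON_THROW_ON_ERROR | JSON_PRETTY_PRINT);
}

// Example input using fields from the v2 tables in this document.
$body = buildSchemaPayload(
    ['doc_type' => ['string', false], 'blog_title' => ['text_el', true]],
    ['blog_title' => ['suggestions', '_text_']]
);
```

Note that Solr rejects `add-field` for a field that already exists, so a schema command meant to be re-run safely would likely need `replace-field` or an existence check; whether `createSchema()` handles this is not stated here.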
Data Model
Solr Document Fields (v1)
| Field | Solr Type | Multi-valued | Source |
|---|---|---|---|
| `id` | string (auto) | No | shop_product.id |
| `soft_delete` | boolean | No | shop_product.soft_delete |
| `active` | boolean | No | shop_product.active |
| `price` | pfloat | No | shop_product.price |
| `product_name` | text_el | Yes | shop_product_mui.name (copy target) |
| `product_name_original` | text_el | No | Original name |
| `product_name_greektolatin` | text_general | No | Transliterated name |
| `product_name_latintogreek` | text_el | No | Reverse transliteration |
| `product_description` | text_el | Yes | shop_product_mui.description (copy target) |
| `vendor_id` | string | No | shop_product.vendor_id |
| `vendor_name` | text_general | No | shop_vendor_mui.name |
| `product_hits` | pint | No | shop_product.hits |
| `product_sales` | pint | No | Aggregated from tmp_shop_order_basket |
| `barcode` | text_en_splitting_tight | Yes | shop_product_barcodes.barcode |
| `product_code` | text_en_splitting_tight | Yes | product_codes.product_code |
| `product_category` | text_el | Yes | shop_product_category_mui.slug |
| `_text_` | text_el | Yes | Copy-field aggregate for full-text search |
Additional v2 Fields
| Field | Solr Type | Multi-valued | Source |
|---|---|---|---|
| `doc_type` | string | No | `product` or `blog` |
| `doc_modification_stamp` | string | No | MD5 hash per indexing run |
| `product_category_slug` | text_el | Yes | Category slugs |
| `suggestions` | text_el_suggest | Yes | Copy-field for autocomplete |
| `blog_title` | text_el | Yes | blog_mui.title |
| `blog_small_description` | text_el | Yes | blog_mui.small_description |
| `blog_description` | text_el | Yes | blog_mui.description |
| `blog_date` | pdate | No | blog.blog_date |
| `blog_hits` | pint | No | blog.hits |
v2 Custom Field Types
| Type Name | Purpose |
|---|---|
| `text_el` | ICU tokenizer + Greek lowercase + Greek-Latin filter + Greek stemmer |
| `text_el_suggest` | N-gram tokenizer (3-15) for autocomplete suggestions |
| `text_product_code` | Whitespace tokenizer + pattern replace + word delimiter + n-gram for barcode/SKU search |
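To make the "N-gram (3-15)" setting of `text_el_suggest` concrete, the sketch below lists the character n-grams of a string for that size range. It only illustrates the kind of tokens such a tokenizer yields; the actual tokenization happens inside Solr, and `ngrams()` is a hypothetical helper that is not part of the codebase. It assumes the `mbstring` extension so that Greek text is split by character rather than by byte.

```php
<?php
declare(strict_types=1);

// List every substring of $text whose length is between $min and $max characters.
function ngrams(string $text, int $min = 3, int $max = 15): array
{
    $length = mb_strlen($text, 'UTF-8');
    $grams = [];

    for ($size = $min; $size <= min($max, $length); $size++) {
        for ($start = 0; $start + $size <= $length; $start++) {
            $grams[] = mb_substr($text, $start, $size, 'UTF-8');
        }
    }

    return $grams;
}
```

With a minimum size of 3, a query shorter than three characters matches no n-gram, which is one reason typeahead front ends often wait for a third keystroke before querying.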
v2 Copy Fields
| Source | Destination | Purpose |
|---|---|---|
| `vendor_name`, `product_name`, `product_category`, `blog_title` | `suggestions` | Autocomplete |
| `vendor_name`, `product_name`, `product_description`, `product_category`, `product_category_slug`, `blog_title`, `blog_small_description`, `blog_description` | `_text_` | Full-text search |
Source MySQL Tables
| Table | Content |
|---|---|
| `shop_product` | Products (filtered: active, not deleted, price > 0) |
| `shop_product_mui` | Product multi-language names and descriptions |
| `shop_vendor_mui` | Vendor names |
| `shop_product_barcodes` | Product barcodes |
| `product_codes` | Product SKU codes |
| `shop_product_category_lp` | Product-to-category relationships |
| `shop_product_category_mui` | Category slugs and names |
| `blog` | Blog articles (v2 only) |
| `blog_mui` | Blog multi-language content (v2 only) |
| `tmp_shop_order_basket` | Product sales aggregation |
Configuration
Job Scheduling (application/config/jobs.php)
```php
['command' => 'SolrIndex', 'schedule' => '30 5 * * *', 'graceTime' => 300, 'retryTimes' => 3]
```
Runs daily at 05:30 in the core queue.
Solr Connection (application/config/app.php)
```php
$config['solrSearch'] = [
    'enabled' => env('APP_SOLR_ENABLED', false),
    'protocol' => env('APP_SOLR_PROTOCOL', 'http'),
    'timeout' => env('APP_SOLR_TIMEOUT', 5),
    'connect_timeout' => env('APP_SOLR_CONNECT_TIMEOUT', 5),
    'host' => env('APP_SOLR_HOST', 'localhost'),
    'port' => env('APP_SOLR_PORT', '8983'),
    'core' => env('APP_SOLR_CORE', 'eshop'),
    'auth' => [
        'enabled' => env('APP_SOLR_AUTH_ENABLED', false),
        'user' => env('APP_SOLR_AUTH_USER', 'user'),
        'pass' => env('APP_SOLR_AUTH_PASS', 'pass')
    ],
    'version' => 'v2'
];
```
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `APP_SOLR_ENABLED` | false | Master toggle for Solr integration |
| `APP_SOLR_PROTOCOL` | http | Connection protocol |
| `APP_SOLR_HOST` | localhost | Solr server hostname |
| `APP_SOLR_PORT` | 8983 | Solr server port |
| `APP_SOLR_CORE` | eshop | Solr core name |
| `APP_SOLR_TIMEOUT` | 5 | HTTP request timeout (seconds) |
| `APP_SOLR_CONNECT_TIMEOUT` | 5 | HTTP connection timeout (seconds) |
| `APP_SOLR_AUTH_ENABLED` | false | Enable Basic auth |
| `APP_SOLR_AUTH_USER` | user | Basic auth username |
| `APP_SOLR_AUTH_PASS` | pass | Basic auth password |
Solr API Endpoints Used
| Operation | Endpoint |
|---|---|
| Schema update | api/cores/{core}/schema |
| Delete documents | solr/{core}/update?commit=true |
| Index documents | solr/{core}/update?softCommit=true |
| Core status | solr/admin/cores?action=STATUS&core={core} |
| Search | solr/{core}/select |
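As a rough illustration of how the connection settings and the endpoint table fit together, the sketch below assembles full URLs from a `solrSearch`-style config array. `solrBaseUrl()` and `solrEndpoints()` are hypothetical helpers; how `AdvSolrClient` actually composes its URLs is not shown in this document.

```php
<?php
declare(strict_types=1);

// Compose "protocol://host:port/" from the connection settings.
function solrBaseUrl(array $config): string
{
    return sprintf('%s://%s:%s/', $config['protocol'], $config['host'], $config['port']);
}

// Map each operation from the endpoint table to a full URL.
function solrEndpoints(array $config): array
{
    $base = solrBaseUrl($config);
    $core = rawurlencode($config['core']);

    return [
        'schema' => $base . "api/cores/{$core}/schema",
        'delete' => $base . "solr/{$core}/update?commit=true",
        'index' => $base . "solr/{$core}/update?softCommit=true",
        'status' => $base . 'solr/admin/cores?'
            . http_build_query(['action' => 'STATUS', 'core' => $config['core']]),
        'search' => $base . "solr/{$core}/select",
    ];
}

// Example using the default values from the configuration section.
$endpoints = solrEndpoints([
    'protocol' => 'http',
    'host' => 'localhost',
    'port' => '8983',
    'core' => 'eshop',
]);
```

The delete endpoint uses a hard `commit` while indexing uses `softCommit`, so deletions are flushed to disk immediately whereas newly indexed documents become searchable before they are durably committed.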
Client Extension Points
1. Override the job class: Extend `AdvSolrIndex` in `application/modules/job/libraries/SolrIndex.php` to add custom document types or modify the indexing strategy.
2. Override the Solr model: Extend `Adv_solr_model` or `Adv_solr_model_v2` in `application/modules/search/models/` to:
   - Add custom fields to the schema
   - Modify `productsToIndex()` to include additional product attributes
   - Add new document type methods (e.g., events, landing pages)
   - Change the `formatData()` / `formatProductData()` / `formatBlogData()` document structure
3. Override the schema: Pass a custom schema array to `createSchema($overrideSchema)` from a controller override.
4. Override the Solr client: Extend `AdvSolrClient` to add custom HTTP middleware, logging, or error handling.
5. Switch schema versions: Set `'version' => 'v1'` in `app.php` to use the simpler v1 schema (products only, no blogs, no modification stamp).
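A model override that adds a custom field might follow the pattern sketched below. `StubSolrModelV2` is a hypothetical stand-in for `Adv_solr_model_v2`, whose real method signatures are not shown in this document, and `in_stock` and `stock` are invented names used only to demonstrate the approach.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for Adv_solr_model_v2; the real signatures may differ.
class StubSolrModelV2
{
    public function formatProductData(array $rows): array
    {
        return array_map(static fn (array $row): array => [
            'id' => 'product:' . $row['id'],
            'doc_type' => 'product',
            'price' => (float) $row['price'],
        ], $rows);
    }
}

// Client-side override: reuse the parent's documents and append one field.
class ClientSolrModel extends StubSolrModelV2
{
    public function formatProductData(array $rows): array
    {
        $documents = parent::formatProductData($rows);

        foreach ($documents as $i => $doc) {
            // "in_stock" is an invented field; it would also need a schema entry.
            $documents[$i]['in_stock'] = ((int) ($rows[$i]['stock'] ?? 0)) > 0;
        }

        return $documents;
    }
}
```

Calling the parent first keeps the override small and lets it pick up future changes to the base document structure; any new field still has to be added to the schema before Solr will accept it, unless the core runs in schemaless mode.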
Business Rules
| Rule | Description |
|---|---|
| Feature gated | Job exits immediately if solrSearch.enabled is not true |
| Full re-index (v1) | v1 wipes the entire core before re-indexing (brief search downtime) |
| Atomic update (v2) | v2 uses modification stamp to remove stale docs without full wipe |
| Active products only | Only products with soft_delete=false, active=true, price>0 are indexed |
| Multi-language support | All language variants are indexed as multi-valued fields |
| Greek-Latin cross-script | v1 uses explicit transliteration fields; v2 uses the GreekLatinTokenFilterFactory plugin |
| Autocomplete (v2) | Copy fields feed n-gram suggestions for typeahead search |
| Blog indexing (v2) | Blog articles are indexed alongside products with doc_type discrimination |
| Sales ranking | Product sales counts are enriched from the order basket analytics table |
| Schema managed separately | Schema creation is a manual CLI step, not part of the indexing job |
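The feature gate and the version dispatch from the rules above can be sketched as a single function. `runSolrIndex()` and the handler map are hypothetical; the real job calls `solr_model::indexData()` or `solr_model_v2::indexData()` directly. The strict comparison follows the rule as written ("not true"); if the project's `env()` helper returns strings, the real check would need to account for that.

```php
<?php
declare(strict_types=1);

// Run the handler for the configured schema version, or skip when Solr is disabled.
// $handlers maps a version string ("v1", "v2") to a callable that performs the indexing.
function runSolrIndex(array $solrConfig, array $handlers): string
{
    // Feature gate: anything other than boolean true means "do nothing".
    if (($solrConfig['enabled'] ?? false) !== true) {
        return 'skipped';
    }

    $version = $solrConfig['version'] ?? '';
    if (!isset($handlers[$version])) {
        throw new InvalidArgumentException("Unknown Solr schema version: {$version}");
    }

    $handlers[$version]();

    return $version;
}
```

Failing loudly on an unknown version, rather than silently falling back to v1, avoids wiping a v2 core with the v1 full re-index by accident.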
Related Flows
- SY-01 Cron Job Framework -- job scheduling and execution
- CF-04 Search -- storefront search using Solr
- AD-45 Search Debug -- admin tool for debugging Solr queries
- IN-10 Doofinder -- alternative external search integration
Wiki Guide: Solr Setup Guide -- full Solr installation, configuration, and schema management guide