Add WordPress NLWeb Plugin by rvguha · Pull Request #404 · nlweb-ai/NLWeb

rvguha · 2026-02-25T21:54:15Z

Summary

This PR adds a WordPress plugin implementation of the NLWeb v0.55 protocol, enabling WordPress sites to provide conversational search powered by LLMs.

Features

Core Functionality

NLWeb v0.55 Protocol: Full support for the NLWeb protocol specification (query, context, prefer, meta sections)
Query Fanout: Automatically rewrites complex natural language queries into 1-5 simpler keyword queries for better retrieval
Decontextualization: Handles multi-turn conversations by contextualizing queries based on conversation history
LLM-based Ranking: Scores search results 0-100 for relevance with AI-generated descriptions
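
As a rough illustration of how 0-100 relevance scores might be applied after ranking, the sketch below drops low-scoring results and orders the rest best-first. The function name `nlweb_sketch_filter_ranked` and the default threshold of 50 are assumptions for illustration and are not taken from the plugin; only the `score` key appears in the PR's code.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): keep results whose score
 * meets a minimum threshold and return them sorted by score, highest first.
 *
 * @param array $ranked    List of result arrays, each expected to carry a 'score' key.
 * @param int   $min_score Assumed cut-off on the 0-100 scale.
 * @return array Re-indexed list of surviving results.
 */
function nlweb_sketch_filter_ranked( array $ranked, int $min_score = 50 ): array {
    // Discard items that have no score or fall below the threshold.
    $kept = array_filter(
        $ranked,
        static function ( $item ) use ( $min_score ) {
            return isset( $item['score'] ) && $item['score'] >= $min_score;
        }
    );

    // usort() re-indexes the array, so callers can safely read element 0.
    usort(
        $kept,
        static function ( $a, $b ) {
            return $b['score'] <=> $a['score'];
        }
    );

    return $kept;
}
```

With a helper like this, the first element of the returned list (if any) is the top-scoring result.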

Integrations

Multiple LLM Providers: OpenRouter, Anthropic, and Gemini support
MCP Apps: Sandboxed iframe rendering with postMessage bridge for interactive applications
WordPress Integration: Uses native WordPress search (WP_Query) with Yoast SEO schema.org metadata

User Interface

ChatGPT-style Interface: Modern conversational UI with message bubbles, avatars, typing indicators
Debug Panel: Collapsible panel showing pipeline execution (decontextualization, query fanout, retrieval, ranking)
Responsive Design: Mobile-friendly with customizable styling

Plugin Structure

code/wordpress/nlweb/
├── nlweb.php                          # Main plugin file, REST endpoints
├── assets/
│   ├── nlweb-frontend.css             # ChatGPT-style interface styles
│   ├── nlweb-frontend.js              # Frontend JavaScript
│   └── nlweb-mcp-renderer.js          # MCP Apps renderer
└── includes/
    ├── class-nlweb-protocol.php       # NLWeb v0.55 protocol handler
    ├── class-nlweb-handler.php        # Main query orchestrator
    ├── class-nlweb-decontextualizer.php
    ├── class-nlweb-query-rewriter.php # Query fanout implementation
    ├── class-nlweb-retriever.php      # WordPress search integration
    ├── class-nlweb-ranker.php         # LLM-based ranking
    ├── class-nlweb-llm.php            # Multi-provider LLM wrapper
    ├── class-nlweb-settings.php       # Admin settings UI
    └── class-nlweb-frontend.php       # Shortcode renderer

Configuration

No hardcoded API keys (stored securely in WordPress options)
Admin UI for configuring LLM providers and models
Configurable settings:
- LLM provider (OpenRouter, Anthropic, Gemini)
- Models for high/low priority tasks
- Query fanout enable/disable
- Post types to search
- Ranking thresholds

Usage

Install plugin in WordPress wp-content/plugins/ directory
Activate plugin in WordPress admin
Navigate to Settings → NLWeb to configure API keys and models
Add [nlweb_search] shortcode to any page
Users can now search with natural language queries

Testing

Tested with:

WordPress 6.7.2
3,220 sci-fi movies with schema.org metadata
OpenRouter with gpt-4o and gpt-4o-mini models
Query fanout working correctly with debug output
ChatGPT-style interface rendering properly

Files Not Included

This PR intentionally excludes:

code/wordpress/docker-compose.yml (test environment setup)
code/wordpress/setup.sh (test environment setup)
API keys (configured via admin UI)

Pull request overview

This PR introduces a comprehensive WordPress plugin that implements the NLWeb v0.55 protocol for conversational search. The plugin integrates multiple LLM providers (OpenRouter, Anthropic, Gemini) to provide AI-powered search capabilities with features like query decontextualization, query fanout, and LLM-based ranking.

Changes:

Adds complete WordPress plugin with admin settings, REST API endpoints, and conversational UI
Implements NLWeb v0.55 protocol with support for multi-turn conversations and query processing pipeline
Integrates with WordPress search, Yoast SEO, and optional WooCommerce for schema.org metadata extraction

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 19 comments.


File	Description
`nlweb.php`	Main plugin file with REST API endpoints for /ask and MCP
`class-nlweb-settings.php`	Admin configuration UI for LLM providers and plugin settings
`class-nlweb-protocol.php`	NLWeb v0.55 protocol request/response handler
`class-nlweb-llm.php`	Multi-provider LLM wrapper with parallel request support
`class-nlweb-retriever.php`	WordPress search integration with schema.org extraction
`class-nlweb-ranker.php`	LLM-based relevance scoring for search results
`class-nlweb-query-rewriter.php`	Query fanout implementation for complex queries
`class-nlweb-handler.php`	Main orchestrator for the search pipeline
`class-nlweb-frontend.php`	Shortcode renderer for conversational UI
`class-nlweb-decontextualizer.php`	Multi-turn conversation context handling
`nlweb-mcp-renderer.js`	MCP Apps iframe rendering with postMessage bridge
`nlweb-frontend.js`	ChatGPT-style conversational interface
`nlweb-frontend.css`	Modern responsive styling for the UI


Copilot · 2026-02-25T22:01:03Z

code/wordpress/nlweb/includes/class-nlweb-llm.php

+                        'headers' => array(
+                            'Content-Type'      => 'application/json',
+                            'x-api-key'         => $api_key,
+                            'anthropic-version'  => '2023-06-01',


The 'anthropic-version' header name is spelled correctly, and its value is hardcoded to '2023-06-01'. If a newer API version is available (confirm against Anthropic's versioning documentation before changing it), consider updating the value or making it configurable.

Suggested change

-                            'anthropic-version'  => '2023-06-01',
+                            'anthropic-version'  => '2024-01-01',

Copilot · 2026-02-25T22:01:03Z

code/wordpress/nlweb/includes/class-nlweb-handler.php

+            'top_score' => ! empty( $ranked ) ? $ranked[0]['score'] : null,
+        );
+
+        /* ---- 4. Optionally summarize ------------------------------ */


Inconsistent comment numbering: The pipeline step is labeled as "4. Optionally summarize" but it's actually step 5 (after 1. Decontextualize, 2. Query Fanout, 3. Retrieve, 4. Rank). Update the comment to say "5. Optionally summarize" for accuracy.

Suggested change

-        /* ---- 4. Optionally summarize ------------------------------ */
+        /* ---- 5. Optionally summarize ------------------------------ */

Copilot · 2026-02-25T22:01:04Z

code/wordpress/nlweb/includes/class-nlweb-query-rewriter.php

+		$seen_ids    = array();
+
+		foreach ( $queries as $query ) {
+			$results = NLWeb_Retriever::search( $query, $max_per_query );
+
+			foreach ( $results as $item ) {
+				$id = $item['id'] ?? null;
+
+				// Skip if we've already seen this item
+				if ( $id && in_array( $id, $seen_ids, true ) ) {
+					continue;
+				}
+
+				$all_results[] = $item;
+				if ( $id ) {
+					$seen_ids[] = $id;


The fanout_search method accesses $item['id'] which doesn't exist in the retriever results. The retriever returns tuples with [url, schema_json, name, site] (line 46 of class-nlweb-retriever.php) but doesn't include an 'id' field. This will cause undefined index errors and the deduplication logic won't work correctly. The deduplication should use 'url' as the unique identifier instead.

Suggested change

-		$seen_ids    = array();
-
-		foreach ( $queries as $query ) {
-			$results = NLWeb_Retriever::search( $query, $max_per_query );
-
-			foreach ( $results as $item ) {
-				$id = $item['id'] ?? null;
-
-				// Skip if we've already seen this item
-				if ( $id && in_array( $id, $seen_ids, true ) ) {
-					continue;
-				}
-
-				$all_results[] = $item;
-				if ( $id ) {
-					$seen_ids[] = $id;
+		$seen_urls = array();
+
+		foreach ( $queries as $query ) {
+			$results = NLWeb_Retriever::search( $query, $max_per_query );
+
+			foreach ( $results as $item ) {
+				$url = null;
+
+				// Support both associative arrays and numeric tuples: [url, schema_json, name, site].
+				if ( is_array( $item ) ) {
+					if ( array_key_exists( 'url', $item ) ) {
+						$url = $item['url'];
+					} elseif ( array_key_exists( 0, $item ) ) {
+						$url = $item[0];
+					}
+				}
+
+				// Skip if we've already seen this URL
+				if ( $url && in_array( $url, $seen_urls, true ) ) {
+					continue;
+				}
+
+				$all_results[] = $item;
+				if ( $url ) {
+					$seen_urls[] = $url;
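
A variation on the URL-based deduplication suggested above, shown as a standalone sketch: it tracks seen URLs as array keys so each check is an `isset()` lookup rather than an `in_array()` scan. The function name `nlweb_sketch_dedupe_by_url` is hypothetical, and the sketch assumes each result is an associative array with a `url` key; the real retriever output may differ.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): merge result lists from
 * several fanout queries, keeping the first occurrence of each URL.
 *
 * @param array $result_lists List of result lists, one per query.
 * @return array Merged results in first-seen order.
 */
function nlweb_sketch_dedupe_by_url( array $result_lists ): array {
    $seen   = array(); // URL => true, used as a set.
    $merged = array();

    foreach ( $result_lists as $results ) {
        foreach ( $results as $item ) {
            // Assumption: items expose their URL under the 'url' key.
            $url = ( is_array( $item ) && isset( $item['url'] ) ) ? (string) $item['url'] : '';

            if ( '' !== $url ) {
                if ( isset( $seen[ $url ] ) ) {
                    continue; // Already collected from an earlier query.
                }
                $seen[ $url ] = true;
            }

            // Items without a URL cannot be deduplicated, so they are kept.
            $merged[] = $item;
        }
    }

    return $merged;
}
```

Keeping items that lack a URL mirrors the behavior of the suggested change, which only skips an item when its identifier has been seen before.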

Copilot · 2026-02-25T22:01:04Z

code/wordpress/nlweb/includes/class-nlweb-llm.php

+        switch ( $provider ) {
+
+            case 'openrouter':
+                $url = $endpoint ?: 'https://openrouter.ai/api/v1/chat/completions';
+                return array(
+                    'url'  => $url,
+                    'args' => array(
+                        'timeout' => $timeout,
+                        'headers' => array(
+                            'Content-Type'  => 'application/json',
+                            'Authorization' => "Bearer $api_key",
+                            'HTTP-Referer'  => get_site_url(),
+                            'X-Title'       => 'NLWeb WordPress Plugin',
+                        ),
+                        'body' => wp_json_encode( array(
+                            'model'    => $model,
+                            'messages' => array(
+                                array( 'role' => 'system', 'content' => $system_msg ),
+                                array( 'role' => 'user',   'content' => $prompt ),
+                            ),
+                            'response_format' => array( 'type' => 'json_object' ),
+                            'max_tokens'      => 512,
+                            'temperature'     => 0,
+                        ) ),
+                    ),
+                );
+
+            case 'anthropic':
+                $url = $endpoint ?: 'https://api.anthropic.com/v1/messages';
+                return array(
+                    'url'  => $url,
+                    'args' => array(
+                        'timeout' => $timeout,
+                        'headers' => array(
+                            'Content-Type'      => 'application/json',
+                            'x-api-key'         => $api_key,
+                            'anthropic-version'  => '2023-06-01',
+                        ),
+                        'body' => wp_json_encode( array(
+                            'model'      => $model,
+                            'max_tokens' => 512,
+                            'system'     => $system_msg,
+                            'messages'   => array(
+                                array( 'role' => 'user', 'content' => $prompt ),
+                            ),
+                        ) ),
+                    ),
+                );
+
+            case 'gemini':
+                $url = $endpoint
+                     ?: "https://generativelanguage.googleapis.com/v1beta/models/{$model}:generateContent?key={$api_key}";
+                return array(
+                    'url'  => $url,
+                    'args' => array(
+                        'timeout' => $timeout,
+                        'headers' => array(
+                            'Content-Type' => 'application/json',
+                        ),
+                        'body' => wp_json_encode( array(
+                            'contents' => array(
+                                array(
+                                    'parts' => array(
+                                        array( 'text' => $system_msg . "\n\n" . $prompt ),
+                                    ),
+                                ),
+                            ),
+                            'generationConfig' => array(
+                                'responseMimeType' => 'application/json',
+                                'maxOutputTokens'  => 512,
+                                'temperature'      => 0,
+                            ),
+                        ) ),
+                    ),
+                );
+
+            default:
+                return null;
+        }


The LLM wrapper doesn't handle the 'openai' provider case but the switch statement includes 'openrouter'. Based on the settings page (line 206 showing 'OpenAI' label for the 'openai' key), there's a mismatch. The build_request method should either support 'openai' directly or the settings should use 'openrouter' as the key. Currently, selecting OpenAI in settings will return null from build_request and fail silently.
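
One way to avoid the silent failure described above is to validate the provider key before a request is built. The sketch below is an assumption-laden illustration: `nlweb_sketch_resolve_provider` is a hypothetical name, and the supported list simply mirrors the three cases in the quoted switch statement.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): normalize a stored provider
 * key and fail loudly if the request builder has no case for it.
 *
 * @param string $provider Provider key as saved in settings.
 * @return string Normalized, supported provider key.
 * @throws InvalidArgumentException When the key is not supported.
 */
function nlweb_sketch_resolve_provider( string $provider ): string {
    // Mirrors the cases in the quoted switch statement.
    $supported = array( 'openrouter', 'anthropic', 'gemini' );

    $key = strtolower( trim( $provider ) );

    if ( ! in_array( $key, $supported, true ) ) {
        // Surfacing the problem here avoids a null request further down.
        throw new InvalidArgumentException( "Unsupported LLM provider: {$provider}" );
    }

    return $key;
}
```

Whether 'openai' should instead be accepted and mapped to a dedicated request format is a design decision for the plugin; this sketch only makes the mismatch visible.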

Copilot · 2026-02-25T22:01:04Z

code/wordpress/nlweb/includes/class-nlweb-query-rewriter.php

+	private static function build_prompt( $query ) {
+		return <<<PROMPT
+You are helping to rewrite a complex search query into simpler keyword queries for a traditional keyword-based search engine.
+The search engine works best with short, focused queries containing important keywords.
+
+Take the following query and break it down into up to 5 simpler search queries.
+Each query should:
+- Contain no more than 3 words
+- Focus on the most important keywords and concepts
+- Be diverse to cover different aspects of the original query
+- Use only essential nouns, adjectives, or product terms
+- Avoid common words like "for", "the", "some", "are", "that", "would", "be"
+
+For example:
+- "what are some options for plates that would be appropriate for serving vegetables" → ["vegetable plates", "serving plates", "dinner plates", "salad plates", "ceramic plates"]
+- "looking for a tea pot that can brew green tea" → ["tea pot", "green tea", "teapot ceramic", "japanese teapot", "brewing pot"]
+- "movies about alien invasions in the 1950s" → ["alien invasion", "1950s scifi", "alien movies", "invasion films", "scifi 1950s"]
+
+The original query is: {$query}
+PROMPT;


The prompt instructs the LLM to limit each query to "no more than 3 words", and every example complies (each example query is 2 words long). For more complex queries, however, a 3-word limit might be too restrictive. Consider allowing 3-5 words for more nuanced queries.
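
Whatever word limit the prompt states, the LLM's output can also be checked in code. The following sketch is hypothetical: the name `nlweb_sketch_clean_queries`, the choice to truncate rather than reject over-long queries, and the defaults of 5 queries and 3 words are assumptions based only on the limits mentioned in the prompt.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): sanitize rewritten queries
 * returned by the LLM so they respect the limits stated in the prompt.
 *
 * @param array $queries     Raw list returned by the LLM.
 * @param int   $max_queries Maximum number of queries to keep.
 * @param int   $max_words   Maximum words per query (extra words are dropped).
 * @return array Cleaned, de-duplicated list of query strings.
 */
function nlweb_sketch_clean_queries( array $queries, int $max_queries = 5, int $max_words = 3 ): array {
    $clean = array();

    foreach ( $queries as $query ) {
        if ( ! is_string( $query ) ) {
            continue; // Ignore non-string entries.
        }

        // Split on any whitespace and drop empty pieces.
        $words = preg_split( '/\s+/', trim( $query ), -1, PREG_SPLIT_NO_EMPTY );
        if ( empty( $words ) ) {
            continue;
        }

        // Truncate over-long queries instead of discarding them.
        $clean[] = implode( ' ', array_slice( $words, 0, $max_words ) );
    }

    // Remove duplicates, re-index, then cap the list length.
    $clean = array_values( array_unique( $clean ) );

    return array_slice( $clean, 0, $max_queries );
}
```

Raising `$max_words` to 5 would match the relaxed limit proposed in this comment without touching the rest of the logic.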

Copilot · 2026-02-25T22:01:07Z

code/wordpress/nlweb/includes/class-nlweb-decontextualizer.php

+        $schema = array(
+            'requires_decontextualization' => 'True or False',
+            'decontextualized_query'       => 'The rewritten query, if decontextualization is required',
+        );
+
+        $response = NLWeb_LLM::ask( $prompt, $schema, 'low' );
+
+        if ( empty( $response ) || ! isset( $response['requires_decontextualization'] ) ) {
+            // LLM failed — fall back to the original query.
+            return array(
+                'query'                       => $raw_query,
+                'requires_decontextualization' => false,
+            );
+        }
+
+        $needs = ( 'True' === $response['requires_decontextualization'] || true === $response['requires_decontextualization'] );


The schema prompt parsing expects boolean values as strings 'True' or 'False' (line 65) but also checks for actual boolean true. This dual handling is good for robustness, but the schema definition on line 51 should clarify the expected format. Consider instructing the LLM to return actual boolean values (true/false without quotes) instead of string values for better type safety.
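
The dual string/boolean check could be centralized in a small coercion helper. This is a minimal sketch under the assumption that the LLM may return `true`, `"True"`, `"false"`, or something unparseable; the name `nlweb_sketch_to_bool` is hypothetical, while `filter_var()` with `FILTER_VALIDATE_BOOLEAN` is standard PHP.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): coerce an LLM-provided flag
 * to a strict boolean, treating anything unrecognized as false.
 *
 * @param mixed $value Value decoded from the LLM's JSON response.
 * @return bool
 */
function nlweb_sketch_to_bool( $value ): bool {
    if ( is_bool( $value ) ) {
        return $value; // Already a real boolean.
    }

    if ( is_string( $value ) || is_int( $value ) ) {
        // Returns true/false for recognized forms ("true", "False", "1", "0", ...)
        // and null for anything else because of FILTER_NULL_ON_FAILURE.
        $parsed = filter_var( $value, FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE );
        return true === $parsed;
    }

    return false; // Arrays, null, and other types are treated as "not required".
}
```

Defaulting to false on unrecognized input matches the existing fallback, which keeps the original query when the LLM response cannot be interpreted.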

Copilot · 2026-02-25T22:01:07Z

code/wordpress/nlweb/includes/class-nlweb-ranker.php

+            return $json_str;  // not valid JSON — send as-is
+        }
+
+        // If it's a list, take first element.


Typo in comment: 'wrap' should be 'wrapped'. The comment says "If it's a list, take first element" but would be clearer as "If it's an array, take the first element" for consistency with PHP terminology.

Suggested change

-        // If it's a list, take first element.
+        // If it's an array, take the first element.

Copilot · 2026-02-25T22:01:07Z

code/wordpress/nlweb/includes/class-nlweb-query-rewriter.php

+if ( ! defined( 'ABSPATH' ) ) {
+	exit;
+}


Mixed indentation detected: this file uses tabs while all other PHP files in the plugin use spaces. WordPress coding standards prefer tabs for PHP files, so this file follows the standard and the others do not, but the mix creates inconsistency within the plugin codebase. For consistency, either convert all files to use tabs (matching WordPress coding standards) or convert this file to use spaces to match the rest of the plugin.

Copilot · 2026-02-25T22:01:07Z

code/wordpress/nlweb/assets/nlweb-frontend.css

+/* Hide WordPress page title on pages with NLWeb search */
+.page .wp-block-post-title {
+    display: none !important;
+}


The CSS selector '.page .wp-block-post-title' hides the title on every page of the site, including pages that don't use the NLWeb search, which is too aggressive. Instead, add a specific class to the shortcode container's parent and only hide titles within that context, or let users control this via their theme.

Copilot · 2026-02-25T22:01:08Z

code/wordpress/nlweb/includes/class-nlweb-retriever.php

+                    return $graph;
+                }
+            } catch ( \Exception $e ) {
+                // Yoast internals changed — fall through.


Missing error handling: The Yoast schema extraction is wrapped in a try-catch, but the catch block is empty, so any exception is silently swallowed and execution falls through. This makes debugging difficult. Consider logging the error before falling through so administrators can identify when Yoast integration fails.

Suggested change

-                // Yoast internals changed — fall through.
+                error_log(
+                    sprintf(
+                        'NLWeb_Retriever: Yoast schema extraction failed for post ID %d: %s',
+                        isset( $post->ID ) ? $post->ID : 0,
+                        $e->getMessage()
+                    )
+                );
+                // Yoast internals changed or threw an error — fall through to other schema sources.


rvguha · 2026-02-26T00:52:03Z

@copilot open a new pull request to apply changes based on the comments in this thread

Copilot · 2026-02-26T00:52:10Z

@rvguha I've opened a new pull request, #405, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI review requested due to automatic review settings February 25, 2026 21:54

Copilot started reviewing on behalf of rvguha February 25, 2026 21:54

github-advanced-security bot found potential problems Feb 25, 2026

code/wordpress/nlweb/assets/nlweb-mcp-renderer.js (fixed)

Copilot AI reviewed Feb 25, 2026

Potential fix for code scanning alert no. 205: Missing origin verification in `postMessage` handler (cf85418)

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

Copilot AI mentioned this pull request Feb 26, 2026

WordPress NLWeb plugin: apply review feedback #405

Draft

