
Implement Visual Interaction Tools with TabbyML

Inspect the original prompt language first, then copy or adapt it once you know how it fits your workflow.

Linked challenge: Build an Advanced Visual Web Agent with OpenAI Agents SDK and GPT-5 Pro

Format: Code-aware
Lines: 8
Sections: 1

Prompt source

Original prompt text with formatting preserved for inspection.

No variables
1 code block
Expand the VisualNavigator agent's capabilities by adding tools for 'click_element_by_text' and 'extract_text_content_by_selector'. Use TabbyML as an AI coding assistant to generate the Playwright/Selenium code for these tools. These tools should simulate actual browser interactions. Ensure the agent can call these new tools effectively within the OpenAI Agents SDK framework.

```python
# Assume client and assistant are already initialized

# Example of integrating a new tool schema
click_tool_schema = {
    "type": "function",
    "function": {
        "name": "click_element_by_text",
        "description": "Clicks a visible element on the page containing specific text.",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "The text content of the element to click."}
            },
            "required": ["text"],
        },
    },
}

# Update assistant with new tool
# assistant = client.beta.assistants.update(assistant.id, tools=[navigate_tool_schema, click_tool_schema, ...])

# Your task: use TabbyML to assist in writing the Python function 'click_element_by_text(text)'
# that uses Playwright/Selenium to find and click the element.
# Integrate this function with your OpenAI Agents SDK tool execution loop.
```
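For orientation, here is a minimal sketch of what the two requested tools might look like, assuming Playwright's sync API rather than Selenium. Several things here are assumptions and not part of the prompt: the functions take the Playwright `page` as an explicit first argument (the prompt's signature is `click_element_by_text(text)`), `DEFAULT_TIMEOUT_MS` is a hypothetical wait budget, and `extract_tool_schema` is a hypothetical schema modeled on `click_tool_schema`. Errors are returned as strings so the model can read them and try something else.

```python
from typing import Any

DEFAULT_TIMEOUT_MS = 5000  # assumed wait budget in milliseconds; tune per site

# Hypothetical schema for the second tool, mirroring click_tool_schema.
extract_tool_schema = {
    "type": "function",
    "function": {
        "name": "extract_text_content_by_selector",
        "description": "Returns the text content of the first element matching a CSS selector.",
        "parameters": {
            "type": "object",
            "properties": {
                "selector": {"type": "string", "description": "CSS selector of the element to read."}
            },
            "required": ["selector"],
        },
    },
}


def click_element_by_text(page: Any, text: str, timeout_ms: int = DEFAULT_TIMEOUT_MS) -> str:
    """Click the first element containing `text` and return a status string."""
    try:
        # get_by_text matches substrings case-insensitively by default;
        # click() waits until the element is actionable or the timeout expires.
        page.get_by_text(text).first.click(timeout=timeout_ms)
    except Exception as exc:  # Playwright raises TimeoutError or Error here
        return f"error: could not click element with text {text!r}: {exc}"
    return f"clicked element with text {text!r}"


def extract_text_content_by_selector(page: Any, selector: str, timeout_ms: int = DEFAULT_TIMEOUT_MS) -> str:
    """Return the stripped text content of the first element matching `selector`."""
    try:
        content = page.locator(selector).first.text_content(timeout=timeout_ms)
    except Exception as exc:  # includes the case where no element appears in time
        return f"error: could not read selector {selector!r}: {exc}"
    # text_content() can return None; normalize to an empty string.
    return (content or "").strip()
```

To match the schema's single-argument shape, one option is to bind the page before registering the tool, for example with `functools.partial(click_element_by_text, page)`, where `page` comes from a browser started through `playwright.sync_api.sync_playwright()`.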

Adaptation plan

Keep the source stable, then change the prompt in a predictable order so the next run is easier to evaluate.

Keep stable

Hold the task contract and output shape stable so generated implementations remain comparable.

Tune next

Update libraries, interfaces, and environment assumptions to match the stack you actually run.

Verify after

Test failure handling, edge cases, and any code paths that depend on hidden context or secrets.
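
One way to make these failure checks concrete is to route every model-requested tool call through a single dispatcher that never raises. The sketch below is a hand-rolled illustration, not the Agents SDK's own mechanism: `execute_tool_call`, `REGISTRY`, and `_stub_click` are hypothetical names, and the stub stands in for a real browser-backed tool so the edge cases can be exercised without launching a browser.

```python
import json
from typing import Callable, Dict


def execute_tool_call(registry: Dict[str, Callable[..., str]], name: str, arguments_json: str) -> str:
    """Run one tool call and always return a string the model can read."""
    tool = registry.get(name)
    if tool is None:
        return f"error: unknown tool {name!r}"
    try:
        kwargs = json.loads(arguments_json or "{}")
    except json.JSONDecodeError as exc:
        return f"error: arguments are not valid JSON: {exc.msg}"
    if not isinstance(kwargs, dict):
        return "error: arguments must be a JSON object"
    try:
        return str(tool(**kwargs))
    except TypeError as exc:
        # Typically a missing or unexpected argument name from the model.
        return f"error: bad arguments for {name!r}: {exc}"
    except Exception as exc:
        # Anything the tool itself raised, e.g. a closed browser.
        return f"error: {name!r} failed: {exc}"


def _stub_click(text: str) -> str:
    """Stand-in for a real click tool; it only echoes its input."""
    return f"clicked {text!r}"


# Hypothetical registry mapping schema names to callables.
REGISTRY: Dict[str, Callable[..., str]] = {"click_element_by_text": _stub_click}
```

Cases worth asserting against a dispatcher like this include an unknown tool name, malformed JSON, a JSON array where an object is expected, a missing `text` argument, and a tool that raises partway through.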