When will a web version of the AS help be available for LLMs to fetch directly?

When will a web version of the AS help be available so that LLMs can fetch it directly from the web?

As far as I know, this is in the backlog, and whether it gets implemented depends on the number of requests from users. So if you want to improve the chances or be informed officially, get in contact with the colleagues from support.

Just to give my opinion on this: I agree with Vratislav, it would be very useful to make the Automation Help content available for AI crawlers, because it would take the B&R knowledge of ChatGPT & Co. to the next level.

BG Alex

The sad part is that the “AI crawlers” category includes some bad actors who relentlessly pound web servers that allow them access, on the order of gigabits or even terabits of traffic a day.

Yes, proxy and caching servers help. But at some point there is always a bottleneck, which will degrade performance for everyone.

A lot of AI scrapers aren’t following robots.txt or the other mechanisms that had previously been the SOP for traffic and consent on the internet.
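For readers who have not dealt with robots.txt before, here is a minimal sketch of what a well-behaved crawler is expected to do before requesting a page. It uses Python's standard `urllib.robotparser`; the robots.txt content, the bot name `ExampleAIBot`, and the example.com URLs are made up for illustration and have nothing to do with B&R's actual servers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; a real crawler would download it from the target site.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
Crawl-delay: 10
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer permission queries."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A polite crawler asks before every fetch and honors the answer.
print(parser.can_fetch("ExampleAIBot", "https://example.com/help/page.html"))
print(parser.can_fetch("SomeOtherBot", "https://example.com/help/page.html"))

# It also waits the requested number of seconds between requests.
print(parser.crawl_delay("SomeOtherBot"))
```

Nothing enforces any of this on the server side: the check only happens if the crawler's author chooses to run it, which is exactly why scrapers that skip it are a problem.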

Open source devs say AI crawlers dominate traffic, forcing blocks on entire countries - Ars Technica

Threat Spotlight: The good, the bad, and the ‘gray bots’ – the Gen AI scraper bots targeting your web apps | Barracuda Networks Blog

An AI Scraping Tool Is Overwhelming Websites With Traffic

Until a balance has been reached between ease of access and responsible use, I’m okay with B&R not opening the floodgates and degrading everyone’s performance or, worse, paying so much for traffic that they take the help offline or put it behind a login.

Hi @Matt_Buck,

thanks for those insights!
As I’m not so familiar with these topics (but I understand the danger behind them): are there any other ways to “feed the AI with data” in a more stable way than robots.txt / llms.txt?

Best regards!

I haven’t explored AI enough to have that answer.

I know there are AIs that can run locally, and AI providers that can store data if you have an account. The likely problem is that the generic, free or cheap AI systems will never cache the data they access. And even if your queries are logged and stored, the system might not deduplicate the data it has retrieved. So if you make 10 different queries about the same B&R error code on 10 different days, and that page of the help hasn’t changed in those 10 days, it may still save 10 identical copies of the data in the query history. In other words, it’s cheaper for the AI company to always query the information from the original source than to deduplicate the data in the query history.
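To make the deduplication idea concrete, here is a minimal sketch of a content-addressed cache: each fetched page is keyed by a hash of its body, so ten identical fetches are stored once. The class name `DedupCache` and the example URL are hypothetical, and this is not a description of how any particular AI provider works.

```python
import hashlib


class DedupCache:
    """Stores each distinct page body once, keyed by its SHA-256 digest."""

    def __init__(self) -> None:
        self.bodies = {}   # digest -> page text (one entry per distinct body)
        self.by_url = {}   # url -> digest of the most recently fetched body

    def store(self, url: str, body: str) -> bool:
        """Record a fetch; return True only if this body was not stored before."""
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        is_new = digest not in self.bodies
        if is_new:
            self.bodies[digest] = body
        self.by_url[url] = digest
        return is_new


cache = DedupCache()
page = "Error 1234: hypothetical help text"

# Ten fetches of an unchanged page on ten different days.
results = [cache.store("https://example.com/help/error-1234", page) for _ in range(10)]
print(results.count(True), len(cache.bodies))
```

Only the first fetch adds a body; the other nine just update the URL pointer. Note that this saves storage only. The traffic to the origin server is saved only if the client also checks the cache before fetching again.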

Automation Studio has done more or less the same thing in the past. An engineer could have 5 copies of the same identical project at different file paths, and each project would have had a different checksum and would have caused an initial installation on the PLC if they transferred back-to-back from any 2 of the 5 projects. (I believe there was a setting you could change to reduce these issues, but by default this was the behavior. And if you were trying to use the debugger, you were out of luck.)

OK, I understand. Just imagine it and compare it to Beckhoff Infosys: I can ask my agent to get the latest information from Infosys. With B&R I now have MCP + RAG over the B&R help, but it is still not updated regularly (it depends on my version of the AS help). Why can’t B&R set up a separate server with the AS help, accessible via API tokens that are valid for one week and that I can get from the B&R support portal to give to my agent for fetching the AS help? Only an agent with an API key could fetch, which would eliminate the unwanted traffic. Just an idea.
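As an illustration of the one-week token idea, here is a minimal sketch of an HMAC-signed token that carries its own expiry time, using only the Python standard library. Everything in it is an assumption for illustration: the token format, the names `issue_token` and `verify_token`, and the secret key. It is not an existing B&R API.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET_KEY = b"hypothetical-server-secret"  # assumption: known only to the issuing server
ONE_WEEK_SECONDS = 7 * 24 * 60 * 60


def _sign(payload: str) -> str:
    """Return the hex HMAC-SHA256 signature of the payload."""
    return hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(user_id: str, now: Optional[float] = None) -> str:
    """Return 'user_id:expiry:signature', valid for one week from `now`."""
    issued_at = time.time() if now is None else now
    payload = f"{user_id}:{int(issued_at) + ONE_WEEK_SECONDS}"
    return f"{payload}:{_sign(payload)}"


def verify_token(token: str, now: Optional[float] = None) -> bool:
    """Accept the token only if the signature matches and it has not expired."""
    current = time.time() if now is None else now
    payload, _, signature = token.rpartition(":")
    expected = _sign(payload)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return False
    _, _, expiry = payload.rpartition(":")
    return expiry.isdigit() and current < int(expiry)


token = issue_token("agent-1")
print(verify_token(token))
```

Because the expiry is part of the signed payload, the server does not need to store issued tokens to reject expired or tampered ones. A real deployment would still want rate limiting per token and a way to revoke a leaked one.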

Hello,

Just a question for LLM dummies like me…

If I go to "AS6\Help-de\Data" on my local PC,
I can find, for example, the subfolder “AS6\Help-de\Data\motion\mapp_motion”, which contains the help pages as HTML (“mapp_motion.html”).

I know this is not an online web page with links to each file, but all the pages are there as HTML. Of course you have to download all the mapp updates, but this data does not change that regularly.

Can’t you send your LLM to learn from this Data?
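As a rough sketch of what "pointing an LLM at the local help" would involve as a first step, the snippet below walks a folder of HTML files and strips them down to plain text with Python's standard `html.parser`. The function names are hypothetical, and the layout of the help folder beyond the path quoted above is an assumption.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the content of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document as one space-joined string."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(extractor.parts)


def iter_help_pages(root: Path):
    """Yield (path, plain_text) for every .html file below `root`."""
    for path in sorted(root.rglob("*.html")):
        yield path, html_to_text(path.read_text(encoding="utf-8", errors="replace"))


# Usage (the root folder depends on where the help is installed):
# for path, text in iter_help_pages(Path(r"AS6\Help-de\Data")):
#     print(path, len(text))
```

The plain text is only the raw material. It still has to be handed to the model somehow, which is where the token cost comes in.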

Greetings
Michael

That’s a question to be directed to B&R IT department, likely through your sales channels/key account support.

Your suggestion is theoretically possible; ease of implementation, long term support needs, CRA compliance, etc would likely all be weighed by their IT department before they would start working on an internal test version.

A local LLM could likely access it if it were running on the same system where the help is installed. Uploading the gigabytes of help files is unlikely to be easy.

@michael.bertsch sending the help directly eats a lot of tokens. A B&R Help MCP server is a good solution, but it also eats a lot of tokens (though significantly fewer than sending the help directly to the LLM). A better solution is MCP server + RAG + vector DB + embedding LLM -> agent. This is probably the best solution for now.

flowchart TD
  HtmlXml[HTML/XML Files] --> Extract[Extract Text]
  Pdfs[PDF Files] --> Convert[Convert to Text]
  Extract --> Chunk[Chunk Text]
  Convert --> Chunk
  Chunk --> Embed[Create Embeddings]
  Embed --> VecDB[Vector DB]
  UserQuery[User Query] --> MCP[MCP Search]
  MCP --> VecDB
  VecDB --> Results[TopK Snippets]
  Results --> Cursor[Cursor Context]
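
The chunk, embed, store and search steps of the diagram can be sketched in a few lines of Python. This is a toy under stated assumptions: `embed` is a plain word-count vector standing in for a real embedding model, `VectorIndex` is an in-memory list standing in for a vector DB, and the MCP layer is left out entirely. All names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40, overlap: int = 10) -> list:
    """Split text into overlapping word windows (requires overlap < max_words)."""
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + max_words]) for i in range(0, last_start, step)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """In-memory stand-in for the vector DB: a list of (chunk, vector) pairs."""

    def __init__(self) -> None:
        self.items = []

    def add_document(self, text: str) -> None:
        for chunk in chunk_text(text):
            self.items.append((chunk, embed(chunk)))

    def search(self, query: str, top_k: int = 3) -> list:
        """Return the top_k chunks most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:top_k]]


index = VectorIndex()
index.add_document("The axis reports error 1234 when the encoder cable is disconnected")
index.add_document("Recipes store machine parameters in CSV files")
print(index.search("encoder error", top_k=1))
```

Only the top-k snippets returned by `search` end up in the agent's context, which is why this approach uses far fewer tokens than sending whole help pages.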

But this depends on your configuration. Updates must be handled by you, and every time you upgrade the help, you have to run it through your embedding LLM again…
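
One way to soften the re-embedding cost after a help upgrade is to keep a manifest of file hashes and only reprocess what actually changed. The following is a minimal sketch of that idea; the function names and the manifest format (path mapped to SHA-256 digest) are assumptions, not part of any existing tool.

```python
import hashlib


def file_digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(content).hexdigest()


def plan_reindex(old_manifest: dict, new_files: dict):
    """Compare stored digests with the current files.

    old_manifest maps path -> digest from the previous indexing run.
    new_files maps path -> file bytes of the upgraded help.
    Returns (changed_or_new_paths, removed_paths, new_manifest).
    """
    new_manifest = {path: file_digest(data) for path, data in new_files.items()}
    changed = sorted(p for p, d in new_manifest.items() if old_manifest.get(p) != d)
    removed = sorted(p for p in old_manifest if p not in new_manifest)
    return changed, removed, new_manifest


# Hypothetical example: one page unchanged, one removed, one added.
old = {"a.html": file_digest(b"one"), "b.html": file_digest(b"two")}
changed, removed, manifest = plan_reindex(old, {"a.html": b"one", "c.html": b"three"})
print(changed, removed)
```

Only the paths in `changed` need to be chunked and embedded again, and the vectors belonging to `removed` can be dropped from the vector DB.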

I’d like to suggest a possible workaround, though it’s just a starting point. I extracted the full local help into formatted PDFs using a tool created with AI.

Since the documentation updates are infrequent and rarely involve major changes, this is currently a very effective way to provide the agent with the complete AS help content.

Could you share the tool you used to create the PDFs? The help in its current form, spread over many folders and subfolders of HTML documents, is not the best option for an AI to access the information.