APC3100: Ethernet connection lost

Environment: APC3100 with AR E4.92, a fairly large project with many OPC UA PVs due to the use of mappView, Ethernet connected to ETH1, one of the built-in RJ45 ports (not an interface on an extension card).

Every once in a while we lose communication with the APC, sometimes after a download but also during normal operation. In this situation not even AS can find the APC3100 in the online settings! While the APC is in this state, all other network stations (Panel PCs, code readers, …) still respond to ping commands. Only the PLC is unreachable, so we ruled out a faulty Ethernet switch.

To us it looks like the Ethernet interface on the PLC itself just stops working.
If we unplug the Ethernet cable from the APC and reconnect it, everything returns to normal. When this occurs there is absolutely no related message in any of the B&R logbooks. They only show a mappView client disconnect, which is most likely an effect of the lost communication and not the cause.

Has anyone encountered similar effects?

Do you have any idea what the CPU utilization is when the connection breaks?

You could check this from System Diagnostics Manager, even after the event.

Hi Michael,

I’ve already seen this type of behavior, where we lost communication with an APC3100. It was caused by an OPC UA client on another APC3100 that was trying to read a variable with the wrong data type. As a result, the CPU usage of the APC3100 went to something like 100% whenever the OPC UA client was connected.

The fact that the PLC didn’t go into service mode made the diagnosis more complex.

Fixing the variables’ data types was the solution.

Please check whether anything is communicating with this APC3100 via OPC UA.

I will check with my local support whether they know which ticket number it was; maybe it could help.

The CPU load is quite stable between 43% and 47% during all operations. The TcIdleFiller is also around 10%. So we don’t think the CPU load is the issue.

Btw: Thanks for your quick reply.


If I remember correctly, I connected an Ethernet cable directly from my PC to the APC and started a profiler, then switched the APC back to the machine network cable. I waited until the APC was no longer reachable, reconnected my PC, uploaded the profiler data, and saw that one task (something like tOpcuaComm) was taking almost 100% of the CPU.

Thank you very much for your input.

Unfortunately the error does not occur frequently enough. I’ll try to think of something to catch this behaviour.

You could maybe detect peaks in CPU usage using the “AsArProf” library, then start a profiler using the same library.
I’ve never used this library, but it seems to be possible.
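
A minimal sketch of the peak-detection idea, written in Python only to show the logic. It is not B&R code and contains no AsArProf calls, since the exact interface of that library is not given in this thread. The class name `CpuPeakDetector`, the 90% threshold and the window size are assumptions for illustration. The detector averages the last few load samples and reports a peak only once, on the rising edge, which is the moment at which a profiler recording would be triggered.

```python
from collections import deque


class CpuPeakDetector:
    """Hypothetical helper: reports once when the averaged CPU load crosses a threshold."""

    def __init__(self, threshold_pct=90.0, window=5):
        self.threshold_pct = threshold_pct
        self.samples = deque(maxlen=window)  # keeps only the newest `window` samples
        self.active = False                  # True while the load stays above the threshold

    def update(self, load_pct):
        """Feed one load sample in percent; return True only when a peak starts."""
        self.samples.append(load_pct)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough samples yet for a meaningful average
        average = sum(self.samples) / len(self.samples)
        was_active = self.active
        self.active = average >= self.threshold_pct
        return self.active and not was_active
```

Averaging over a short window avoids triggering on a single noisy sample, and the rising-edge check avoids restarting the profiler on every cycle while the load stays high.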

We already use the FUB LogIdleShow() to monitor the CPU load constantly. Our own log messages are attached as well. I’m currently searching for those entries in our log.

Thank you again.


Hi,

I can’t remember the details, but some years ago I saw something comparable. Even though everything else worked as expected, in the end it was the switch after all.
In my case, the switch had some internal overload or buffer overflow, which caused the port the PLC was connected to to get stuck (and maybe also a loss of the ARP table information). No data was transported over that port any more until the cable was unplugged (which triggered a hardware link-loss event and some kind of chip reset on both sides) and plugged in again.

As it wasn’t a managed switch, we weren’t able to do a more detailed analysis at the switch / Ethernet level.
But we connected an additional hub between the PLC and the switch, and used Wireshark to evaluate the network traffic to and from the PLC.
In addition, we generated some active outbound communication from the PLC (by cyclically calling IcmpPing() or UdpSend(); unfortunately I can’t remember which).
As soon as the connection loss happened, we saw in Wireshark that the PLC was still trying to send data out of the interface, which led us to the conclusion that the root cause was the switch.

I’m aware that this was a very special and quite unlikely issue, and we were lucky that the error occurred once or twice a day, which made our evaluation possible at all.
But I wanted to mention it nevertheless.

From an error detection point of view, I would at least recommend adding an outgoing ping in the PLC, every 1-2 seconds or so, and maybe adding the AsEth - EthStat() function block to read the interface statistics. I’m aware that while the connection is lost you can’t display anything on your terminal, but it could be useful for getting more information (e.g. by adding user logger entries when the connection is lost).
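
A minimal sketch of the watchdog logic behind such a cyclic ping, written in Python only to illustrate the idea. It is not B&R code: the actual ping would come from a function block such as IcmpPing(), whose interface is not shown here, and the names `LinkWatchdog` and `max_misses` are assumptions. The watchdog counts consecutive missed replies and records a timestamped entry only when the state changes, so the log stays short and the time of the loss can later be compared with the other logbooks.

```python
class LinkWatchdog:
    """Hypothetical helper: turns cyclic ping results into LOST / RESTORED log entries."""

    def __init__(self, max_misses=3):
        self.max_misses = max_misses  # consecutive missed replies before the link counts as lost
        self.misses = 0
        self.link_lost = False
        self.log = []                 # list of (timestamp, event) tuples

    def update(self, reply_ok, timestamp):
        """Feed one ping result; return 'LOST', 'RESTORED', or None if nothing changed."""
        if reply_ok:
            self.misses = 0
            if self.link_lost:
                self.link_lost = False
                self.log.append((timestamp, "RESTORED"))
                return "RESTORED"
            return None
        self.misses += 1
        if not self.link_lost and self.misses >= self.max_misses:
            self.link_lost = True
            self.log.append((timestamp, "LOST"))
            return "LOST"
        return None
```

Requiring several consecutive misses keeps a single dropped packet from producing a false entry. With a ping every 1-2 seconds and `max_misses=3`, a real loss would be logged within a few seconds.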

Best regards!


Dear Alexander,
thank you for your detailed answer. I think we will try to investigate in this direction. The check of the CPU load showed nothing: the load was not unusually high around the time the connection was lost. So it may well be the same situation as you had.


It should be noted that OPC UA, along with ANSL/PVI, VC3 and VC4, all run in the idle task class of the PLC and are therefore limited to the PLC’s idle time. I’ve experienced AS communication issues, OPC UA issues, and VC4 losing its connection when the PLC is running at or near 100% usage. This is the expected behavior in these states, as the PLC programs take precedence over these services.
