Using two AsEthIP Library Instances on X90CP174 Controller Crashes the PLC

Hi Community,

I have a very interesting issue here. I’m using an X90CP174, which has two Ethernet ports. One is Ethernet (IF2); the other is PowerLink (IF3) but can also be configured as Ethernet (IF3.ETH).

I configured them as Ethernet ports and let them sit on two different subnets.

Then I created two POUs that use the B&R library AsEthIP to communicate with a Rockwell PLC that has two different Ethernet/IP interfaces. The two POUs are almost identical, apart from some tag name differences and a condition check that one of them performs before running. In this case, both of them run at the same time.

The issue is that the X90 PLC crashes and reboots into service mode after a few minutes or hours of running, and the logger indicates that the crash is caused by the AsEthIP program running on IF3 (the port reconfigured from PowerLink to Ethernet). This POU runs in an 80ms cycle, while the other one runs in a 50ms cycle.

I can confirm that if I only run one Ethernet/IP communication on IF2, there is no issue.

Taking a step back and looking at this configuration, it looks awfully similar to making the X90 PLC a shared Ethernet/IP device for two scanners (although there is actually only one scanner in this case, reached over two different subnets). Knowing that the AsEthIP library is a simple driver that makes the B&R PLC an Ethernet/IP device, I wonder if this approach pushes it too far?

Also, can anyone give me some advice on how to resolve this?

Thank you, community.

Hi @c583869, that’s an interesting one indeed!

Could you post a screenshot that shows all of the Logger messages that are created when the crash happens, or maybe even a system dump that contains the relevant logs? That would help us to diagnose the issue.

Hi Marcus,

Thank you for your reply.

Here’s the system dump.

I currently have a theory: the B&R library AsEthIP relies on some internal resources and doesn’t support multiple instances. If I run more than one, the instances will conflict over those resources.

I may need to let the IF3.ETH communication wait and only run if the IF2 communication fails. However, managing the resources and switching back to IF2 once its communication recovers could be challenging.

I also hope that I can verify this theory…

BTW, I’m new to B&R and can’t really understand much information from the system dump. Can you share some information about how to decipher the system dump file, please?

BuR_SDM_Sysdump_2024-06-19_22-01-46.tar.gz.zip (1.8 MB)

Thanks for sending the system dump!

The Ethernet/IP specification doesn’t support multiple Scanners with exclusive connections to the same Adapter. However, if you set up two instances of the AsEthIP function blocks on two different Ethernet interfaces which have two separate subnets, you sort of end up with two different Adapters. It looks like what you’re trying to do has worked in the past, but it is a bit of a niche use case.

The EIPInit function block, when executed successfully, returns a “handle” variable which is then passed to the other function blocks. This is essentially just a memory address where the function blocks will store and share data between themselves. The key is that these handles need to be unique. You want to make sure each set of function blocks is operating entirely within its own memory space.
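The uniqueness requirement can be expressed as a small check. This is only an illustrative sketch in Python, not B&R code: it assumes each handle can be read as an integer and that 0 means "not initialized", and the function name `handles_are_independent` is hypothetical.

```python
def handles_are_independent(handle_a: int, handle_b: int) -> bool:
    """Return True if both handles are set and refer to different memory.

    Assumption: a handle of 0 means the init call has not succeeded yet.
    """
    return handle_a != 0 and handle_b != 0 and handle_a != handle_b
```

If a check like this fails after both init calls report success, the two sets of function blocks are sharing memory and should not be run side by side.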

Looking at your log files, it looks like every time you get an error, the Logger entry is the same - a memory access violation:

[Screenshot: Logger entry reporting the memory access violation]

Unfortunately, this error is a little generic. It just means that something tried to read from or write to memory it shouldn’t be accessing. This issue usually stems from improperly used memcpys or pointers. However, in this case the Backtrace can be very helpful. If you click on a relevant entry in the Logger and then click on the Backtrace tab, it lists the entries that led up to the violation.

If you have the relevant project open, some of these entries will show green arrows next to them. This means you can double-click them and Automation Studio will take you to the line of code which caused the error. Note that multiple lines will be involved (a program calls a function which uses a variable, etc.), so I recommend looking at where each entry takes you. If the memory violation occurred within the AsEthIP library itself, you will not be able to see the code. However, if the problem can be traced back to something within your program, it will be easier to correct.

Like I said, what you’re trying to do is a little unique, but I’m not seeing anything that makes it impossible as long as it’s done correctly. Something you might want to try is putting both programs in the same task class just to see if that makes a difference. I’m curious though, if both Scanners are on the same Rockwell PLC, why use two connections to the same B&R PLC rather than send all of the data using one connection and exchange data between programs internally on both devices?

Regarding reading and understanding the system dump: it is just a tarball which contains system data in XML format as well as Logger and Profiler exports. It’s essentially a snapshot of the System Diagnostics Manager at the time of creation. There is a program on GitHub you can use to open and view the system dump’s data in a more readable format. This program is very helpful, but note that it is not officially provided or supported by B&R. You can also extract the tarball (with a program like 7-Zip) to get the Logger and Profiler files, then open those in Automation Studio.
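If you would rather script the extraction than use an archive tool, here is a minimal Python sketch using only the standard library. It assumes the dump is a tar archive that may be wrapped in an extra .zip layer (as the attachment in this thread is); the function name `extract_sysdump` is made up for illustration, and the names of the files inside the dump are not assumed.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_sysdump(archive: Path, out_dir: Path) -> list[str]:
    """Unpack a system dump and return the relative paths of all files written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    tar_path = archive
    if zipfile.is_zipfile(archive):
        # Forum attachments may be wrapped in an extra .zip layer: unwrap it first.
        with zipfile.ZipFile(archive) as zf:
            inner = next(
                name for name in zf.namelist()
                if name.endswith((".tar.gz", ".tgz", ".tar"))
            )
            tar_path = Path(zf.extract(inner, out_dir))
    # Mode "r" lets tarfile detect gzip compression automatically.
    with tarfile.open(tar_path) as tf:
        tf.extractall(out_dir)
    return sorted(
        str(p.relative_to(out_dir)) for p in out_dir.rglob("*") if p.is_file()
    )
```

Calling `extract_sysdump(Path("BuR_SDM_Sysdump_2024-06-19_22-01-46.tar.gz.zip"), Path("sysdump"))` would leave the extracted files under `sysdump/`, ready to be opened in Automation Studio.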

Hi Marcus,

Thank you very much for your detailed information, it’s very helpful.

I can now confirm that the crash is caused by the cyclic communication function block of AsEthIP that runs in the slower task. I was told these two AsEthIP instances need to run at least 50ms apart, so one runs faster than the other.

It’s particularly interesting to learn that the initialization function block reserves a memory area for the library function blocks. Considering this and the fast/slow relationship, I suppose the PLC crashes because the two initialization function blocks don’t know about each other, so they can end up reserving the same memory area.

I also found that the exit command from one instance affects the other. For example, I stopped the secondary communication, let the primary communication run properly, and then initialized the secondary communication. If I then run the exit command while the handles overlap, it affects both communications.

So, correct me if I’m wrong, you can have two instances, but you cannot let them run simultaneously (no memory reserved, etc.).

Knowing this, the Ethernet/IP communication on the IF3.ETH port will have to wait and only initialize itself if the IF2 one fails. Switching between these two communications will require some further resource management so they never run at the same time.

Thank you for sharing the information about the GitHub repo as well.

An update on this before closing this topic.

By running EIPExit() and EIPInit() to manage the resources for the Ethernet/IP communication, and keeping two Data Objects for the IF2 and IF3.ETH interface configurations, I can run one AsEthIP instance at a time and switch between IF2 and IF3.ETH. Switching is very slow (it takes seconds), but in my application that’s acceptable.

This approach is not recommended by B&R but it seems to be working for me.
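The switching logic can be sketched roughly as follows. This is a Python illustration of the idea only, not the actual PLC code: `init` and `exit_` are hypothetical callbacks standing in for EIPInit() and EIPExit() (their real parameters are not shown in this thread), and the class name `FailoverLink` is made up.

```python
from typing import Callable, Optional


class FailoverLink:
    """Keeps at most one interface initialized at any time."""

    def __init__(
        self,
        init: Callable[[str], int],    # hypothetical stand-in for EIPInit()
        exit_: Callable[[int], None],  # hypothetical stand-in for EIPExit()
        primary: str = "IF2",
        backup: str = "IF3.ETH",
    ) -> None:
        self._init = init
        self._exit = exit_
        self.primary = primary
        self.backup = backup
        self.active: Optional[str] = None
        self._handle: Optional[int] = None

    def _switch_to(self, interface: str) -> None:
        if self.active == interface:
            return  # already running on the requested interface
        if self._handle is not None:
            # Release the old instance before creating a new one,
            # so the two never hold resources at the same time.
            self._exit(self._handle)
            self._handle = None
        self._handle = self._init(interface)
        self.active = interface

    def update(self, primary_ok: bool) -> Optional[str]:
        """Call periodically; returns the interface that is now active."""
        self._switch_to(self.primary if primary_ok else self.backup)
        return self.active
```

The important design choice is the ordering inside `_switch_to`: the exit call always completes before the next init call, which mirrors the rule that the two instances must never be initialized simultaneously.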

Happy to help, and thanks for the updates! I’ve never tried this myself so I’m glad you found a solution and were able to post the details here.

It makes sense that you’re running into memory issues if both instances of EIPInit() are reserving the same location in memory. There may be a way to avoid this, but unless someone else has an idea here you’d have to reach out to your local B&R team to create a Support ticket for more information.

Hi Marcus,

It’s been a while but I’m happy to share a further update.

I managed to run two AsEthIP instances “at the same time” and I think more is possible. Here’s what I have done:

  1. Use a timer to create a scheduler. In my case, the timer runs on a 400ms cycle.
  2. During 0 - 100ms, run instance 1.
  3. During 100 - 200ms, run none.
  4. During 200 - 300ms, run instance 2.
  5. During 300 - 400ms, run none.

By doing this, the instances never run at the same time and are always separated by 100ms. As far as I know, any gap bigger than 50ms is acceptable. I have been running my program with this approach for a day and it has been fine.
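The slot logic of such a scheduler can be sketched as below. This is a Python illustration rather than the PLC code itself; it assumes each window is half-open (for example, 0 up to but not including 100ms), and the names `CYCLE_MS`, `SLOT_MS`, `SLOTS` and `instance_for` are made up.

```python
from typing import Optional

CYCLE_MS = 400             # full scheduler period
SLOT_MS = 100              # length of each window
SLOTS = (1, None, 2, None)  # instance to run in each window; None = idle gap


def instance_for(elapsed_ms: int) -> Optional[int]:
    """Return which instance may run at this time, or None during a gap."""
    slot = (elapsed_ms % CYCLE_MS) // SLOT_MS
    return SLOTS[slot]
```

Each task would then call its cyclic function block only when `instance_for` returns its own number. Adding a third instance would mean extending `SLOTS` and `CYCLE_MS` while keeping an idle window between any two active ones.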

That’s neat; I’m glad you found something that works. Thanks for sharing!
