with TCP, automatic reconnect on error may not be desired.

Bug #211460 reported by Todd Denniston
Affects: libmodbus
Status: Fix Released
Importance: Medium
Assigned to: Stéphane Raimbault

Bug Description

Per the discussion at:
https://answers.launchpad.net/libmodbus/+question/28759

TCP connect can hang (for 3 to 190 seconds in practice) if the TCP-connected RTU is not responding, which is a bad thing in a near/soft real-time system.
Doing an automatic reconnect on error means the application will hang until connect returns, instead of allowing the application to choose when it wants to risk the hang.
That is, the application might instead want to ping (with a timeout) the host occasionally and only try to reconnect when it knows the remote host is up.
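
As an illustration of the "check with a timeout first" idea, here is a minimal sketch using plain POSIX sockets. The helper name `tcp_probe` is hypothetical and is not part of libmodbus; it attempts a non-blocking connect and waits at most `timeout_ms` for the result, so the caller never blocks for minutes.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not a libmodbus function): returns 0 if a TCP
 * connection to ip:port completes within timeout_ms, -1 otherwise. */
int tcp_probe(const char *ip, int port, int timeout_ms)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                      /* not a valid dotted-quad address */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Switch to non-blocking mode so connect() returns immediately. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }

    int ok = -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        ok = 0;                         /* connected at once */
    } else if (errno == EINPROGRESS) {
        /* Wait for the socket to become writable, bounded by timeout_ms. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        if (poll(&pfd, 1, timeout_ms) == 1) {
            int err = 0;
            socklen_t len = sizeof(err);
            /* SO_ERROR tells whether the pending connect succeeded. */
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0 && err == 0)
                ok = 0;
        }
    }
    close(fd);                          /* this is only a probe */
    return ok;
}
```

An application could call something like `tcp_probe("192.168.0.100", 502, 200)` at a convenient moment and only start the real, blocking reconnect when it returns 0.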

I am not stuck on the names of variables or the type I added in this patch, feel free to change them to match better with the project.

A note, which I did not add in the patch:
the application should close and reconnect before attempting to use the TCP RTU again after any error that calls error_treat[1], because with my setup I saw anomalies when reusing the same connection after a fault that would have caused a reconnect.

[1] any of the faults under /* Local */
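
The policy described in this note can be sketched as a small decision function on the application side. All names below (`link_state_t`, `link_action_t`, `link_next_action`) are hypothetical and only illustrate the idea: after a failed request the connection is closed rather than reused, and a new connect is attempted only once the host is known to be reachable.

```c
#include <stdbool.h>

/* Hypothetical application-side types; none of these come from libmodbus. */
typedef enum { LINK_UP, LINK_DOWN } link_state_t;
typedef enum { ACT_NONE, ACT_CLOSE, ACT_CONNECT } link_action_t;

/* Decide what the application should do next.
 *  - link up and the last request failed: close the socket, do not reuse it
 *  - link down and the host answers a bounded-time check: connect now
 *  - otherwise: nothing to do */
link_action_t link_next_action(link_state_t state,
                               bool request_failed,
                               bool host_reachable)
{
    if (state == LINK_UP)
        return request_failed ? ACT_CLOSE : ACT_NONE;
    return host_reachable ? ACT_CONNECT : ACT_NONE;
}
```

Keeping the decision separate from the socket calls lets the application choose when it is willing to risk a blocking connect.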

Revision history for this message
Todd Denniston (todd-denniston) wrote :
Revision history for this message
martin (intenseheart-2004) wrote :

When I am trying to connect to the device through TCP it shows me
connecting to 192.168.0.100 and it takes about 3 minutes,
and after 3 minutes it displays
connect:connection timed out
What does it mean?
Did it stay connected to my device for a long time,
or did it not connect to my device at all?

Can you suggest anything?

Revision history for this message
Stéphane Raimbault (sra) wrote : Re: [Bug 211460] Re: with TCP, automatic reconnect on error may not be desired.

Fine, but I will make some changes to apply your patch on trunk.
The error management mustn't change the API, so I will use something like this:

modbus_init_tcp(&mb_param, "192.168.0.100", MODBUS_TCP_DEFAULT_PORT);

By default, the MODBUS_RECONNECT mode will be set, so the developer must use:
modbus_set_tcp_error_treat(&mb_param, MODBUS_ABORT);
to change the default setting.

Typo TPC -> TCP in your patch.
mb_param->error_treat_tcp_handling -> mb_param->error_treat_tcp

Stéphane.

Revision history for this message
Todd Denniston (todd-denniston) wrote :

Stéphane Raimbault wrote, On 04/03/2008 05:13 PM:
> Fine, but I will make some changes to apply your patch on trunk.
> The error management mustn't change the API, so I will use something like this:
>
> modbus_init_tcp(&mb_param, "192.168.0.100", MODBUS_TCP_DEFAULT_PORT);
>
??? This must be from one of the branches you are working on, as 1.2.4 did not have a place to put the port in the init line.
But I am OK with using the function you have named below.

> By default, the MODBUS_RECONNECT mode will be set, so the developer must use:
> modbus_set_tcp_error_treat(&mb_param, MODBUS_ABORT);
> to change the default setting.
>

The name of the enum you have given here worries me. To me, ABORT means end the program; I intend for it to just skip the reconnect. Perhaps a more descriptive enum would be good here, like MODBUS_NO_AUTO_RECONNECT or MODBUS_NO_RECONNECT.

> Typo TPC -> TCP in your patch.

Agggh! a mistake! :)

> mb_param->error_treat_tcp_handling -> mb_param->error_treat_tcp
>
Looks good.
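
To make the proposal concrete, here is a minimal sketch of how such a setting could behave. The enum values `MODBUS_RECONNECT` and `MODBUS_NO_AUTO_RECONNECT` and the field name `error_treat_tcp` are taken from this discussion; the `demo_*` types and functions are stand-ins invented for illustration and are not the libmodbus definitions.

```c
#include <stddef.h>

/* Stand-in types for illustration only; not the libmodbus definitions. */
typedef enum {
    MODBUS_RECONNECT,          /* default mode named in the thread */
    MODBUS_NO_AUTO_RECONNECT   /* descriptive alternative to ABORT */
} demo_error_treat_t;

typedef struct {
    int fd;                             /* socket descriptor, -1 if closed */
    demo_error_treat_t error_treat_tcp; /* what to do on a local fault */
    int reconnect_attempts;             /* counts reconnects the demo would make */
} demo_param_t;

/* The init signature stays unchanged; the default keeps the old behaviour. */
void demo_init_tcp(demo_param_t *p)
{
    p->fd = -1;
    p->error_treat_tcp = MODBUS_RECONNECT;
    p->reconnect_attempts = 0;
}

/* Separate setter, so existing callers of the init function are unaffected. */
void demo_set_tcp_error_treat(demo_param_t *p, demo_error_treat_t mode)
{
    p->error_treat_tcp = mode;
}

/* Called on a local fault. Returns 1 if a reconnect would be attempted,
 * 0 if the decision is left to the application. */
int demo_error_treat(demo_param_t *p)
{
    if (p->error_treat_tcp == MODBUS_RECONNECT) {
        p->reconnect_attempts++;  /* real code would close and connect here */
        return 1;
    }
    return 0;
}
```

With the default left in place nothing changes for existing programs; an application that cannot tolerate the hang opts out with one extra call after init.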

Revision history for this message
Stéphane Raimbault (sra) wrote :

ABORT was stupid!

I don't speak English, could you comment my patch (on trunk branch), please?

Revision history for this message
Todd Denniston (todd-denniston) wrote : Re: [Bug 211460] Re: with TCP, automatic reconnect on error may not be desired.

martin wrote, On 04/03/2008 05:01 PM:
> When I am trying to connect to the device through TCP it shows me
> connecting to 192.168.0.100 and it takes about 3 minutes,
> and after 3 minutes it displays
> connect:connection timed out
> What does it mean?

The library was not able to establish a connection with your device.
for me:
3 second hang = network physically down.
180 second hang = a LAN gateway/bridge is attempting to make the connection
for you.

> Did it stay connected to my device for a long time,
> or did it not connect to my device at all?
>
>
> Can you suggest anything?
>

0) If this reply does not fix this for you, please take the rest of this
discussion to the "answers" board[1], so we don't keep the bug active after
Stéphane gets the code changed, because I am fairly certain you have a problem
only peripherally related to the bug.
1) find out what the MAC (hardware Ethernet) address of your device is.
2) ping -c2 -w2 192.168.0.100
3) /sbin/arp -n |grep -e 192.168.0.100 -e HWaddress
4) compare the arp output with what you know the device should be.
I suspect that:
A) the ping will have 100% packet loss.
B) the arp address output will not match the MAC of the device.

Assuming A & B are right: you will need to first resolve why your control
machine can not talk to the device. If you can, it is easier to troubleshoot
these if it is only the two devices connected to the CAT5 cable or network hub.

Hope this helped.

[1] https://answers.launchpad.net/libmodbus/
click "Ask a question", fill in blanks :)
I have set myself as an answer contact so I get an email ANY time someone
posts a question or answer there.

--
Todd Denniston

Revision history for this message
Todd Denniston (todd-denniston) wrote :

Stéphane Raimbault wrote, On 04/03/2008 05:47 PM:
> ABORT was stupid!
>
> I don't speak English, could you comment my patch (on trunk branch),
> please?
>
>
> ** Attachment added: "patch-error-treat-todd.patch"
> http://launchpadlibrarian.net/13100082/patch-error-treat-todd.patch
>

I looked at the patch (as I can't use Bazaar from work) and it looks mostly good.

The ONLY alteration I would suggest is that modbus_set_error_handling should
return the value mb_param->error_handling on success and -1 on fault, or just
always return mb_param->error_handling.
--
Todd Denniston

Revision history for this message
Todd Denniston (todd-denniston) wrote :

Actually, scratch the suggestion for change to modbus_set_error_handling, as none of the other init functions return anything.

Patch is good to go!

Revision history for this message
martin (intenseheart-2004) wrote :

Hey! I just figured out that the connect function is returning -1:
ret = connect(mb_param->fd, (struct sockaddr *)&addr, sizeof(struct sockaddr_in));

ret = -1
Where do you think the problem is?
If I give my IP address 198.168.111.11,
then
addr.sin_addr.s_addr = inet_addr(mb_param->ip)
comes out as
something like 1785654544.
Do you think this is the problem?
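
A large integer in `s_addr` is expected in itself: the field holds the four address bytes packed into a 32-bit value in network byte order, so it never looks like the dotted form. A minimal way to check that an address parses correctly is to convert it and then convert it back. The helper name `ip_roundtrip` below is hypothetical; `inet_pton` and `inet_ntop` are standard POSIX calls.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: parse a dotted-quad string and convert it back to
 * text. Returns 0 on success, -1 if the string is not a valid address. */
int ip_roundtrip(const char *ip, char *out, size_t outlen)
{
    struct in_addr a;
    if (inet_pton(AF_INET, ip, &a) != 1)
        return -1;

    /* s_addr is in network byte order; as a decimal number it looks arbitrary. */
    printf("%s -> s_addr = %lu\n", ip, (unsigned long)a.s_addr);

    return inet_ntop(AF_INET, &a, out, (socklen_t)outlen) != NULL ? 0 : -1;
}
```

If the round trip gives back the same string, the address conversion is fine and the -1 from connect has another cause; calling `perror("connect")` right after the failed call prints the reason recorded in `errno`.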

Revision history for this message
martin (intenseheart-2004) wrote :

I was wondering: if I use my IP address 192.168.111.11,
then do I have to change MODBUS_TCP_PORT from 502 to some other number?
I was wondering why it is not connecting to my device.

Revision history for this message
martin (intenseheart-2004) wrote :

I really did not get the conversation, and I did not get which patch I have to add to my program.
Can someone post a clear patch here so that I can copy it into my program?

Revision history for this message
Todd Denniston (todd-denniston) wrote : Re: with TCP, automatic reconnect on error may not be desired.

martin wrote, On 04/04/2008 04:00 PM:
> I really did not get the conversation, and I did not get which patch I have to add to my program.
> Can someone post a clear patch here so that I can copy it into my program?
>

Please follow the link:
https://answers.launchpad.net/libmodbus/+question/28929

--
Todd Denniston

Revision history for this message
Stéphane Raimbault (sra) wrote :

Martin,

A bug report is not a forum; could you ask your questions only with the Answers tool, please?

Revision history for this message
Stéphane Raimbault (sra) wrote :

Thank you Todd.

Changed in libmodbus:
assignee: nobody → sra
importance: Undecided → Medium
milestone: none → 2.0
status: New → Fix Committed
Revision history for this message
Todd Denniston (todd-denniston) wrote :

Any chance of seeing a 1.2.5 cut with this patch?
Thanks.

Revision history for this message
Stéphane Raimbault (sra) wrote :

I intend to release libmodbus 2.0 shortly, so I don't want to maintain the 1.2 series.

Changed in libmodbus:
status: Fix Committed → Fix Released