dexer
01-23-2017, 08:41 PM
Post: #21
RE: dexer
(01-22-2017 09:37 PM)lingu Wrote:  
(01-22-2017 04:12 PM)zma Wrote:  
(01-22-2017 03:10 PM)YU_Xinjie Wrote:  We may consider making "return non-zero when the cmd on any node fails" the default behaviour of dexer.
It is usually helpful to exit as soon as possible on failure.
If dexer is run as root, it is even more important to stop on failure.

One implementation to refer to is the wait_errexit function in glad-common.sh.

@Zhiqiang @Wentao @lingu

What is your opinion?

Dying early and reporting early sounds like a good idea to me as the default behavior.

If users want to tolerate failures in dexer, there are many ways to do so.

In distributed execution ("-p"), scatter-gather is better than strict early termination, which requires preemptive killing.

In any case, I think returning non-zero and early termination are two separate issues, and in this case we handle returning non-zero only. It is fine for dexer to return non-zero if any scatter task fails on any node.

Agreed. This is exactly what I wanted to say, and it is how wait_errexit() of glad works.
I will propose a detailed design later.
01-24-2017, 10:52 AM
Post: #22
RE: dexer
(01-23-2017 08:41 PM)YU_Xinjie Wrote:  
Agreed. This is exactly what I want to say and how wait_errexit() of glad works.
Will propose a detailed design later.

To be clear, what I meant by "die early" is: if there are errors, stop invoking any further tasks, to keep things simple. This is what wait_errexit() is well suited for.
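
The "stop invoking further tasks" idea can be sketched for the sequential case as below. This is only an illustration: run_until_failure is a hypothetical name, the tasks are local shell strings rather than ssh calls, and it is not the wait_errexit() code from glad.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run each task in order and stop at the first failure.
# Tasks are plain shell command strings here; dexer would use ssh instead.
run_until_failure() {
    local task rc
    for task in "$@"; do
        sh -c "$task"
        rc=$?
        if [ "$rc" -ne 0 ]; then
            echo "task failed (rc=$rc), not invoking further tasks: $task" >&2
            return "$rc"
        fi
    done
    return 0
}

# Example: the third task is never started because the second one fails.
run_until_failure 'echo first' 'exit 3' 'echo never-run'
echo "exit status: $?"
```

Tasks that were already started are left alone; only later tasks are skipped, so no preemptive killing is needed.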
05-31-2017, 11:40 AM
Post: #23
RE: dexer
@Zhiqiang

Driving example:
When I tried to update the bin tool on gm55 with dexer, it complained:
Code:
[root@gm55a-2 bin]# dexer 'cd /thinker/net/bin; make install'
10.6.1.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.2.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.3.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.4.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.5.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.6.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.7.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

I suggest we add the '-t' option to the ssh call in dexer.
Please review.

Quote:...
ssh -t $username@$ip "$command"
...
05-31-2017, 11:57 AM
Post: #24
RE: dexer
(05-31-2017 11:40 AM)YU_Xinjie Wrote:  @Zhiqiang

I suggest we add a '-t' option into the ssh of dexer.
Please review.

Quote:...
ssh -t $username@$ip "$command"
...

Sounds good.

BTW: it's strange that bin's `make install` calls `sudo`. `make install` is supposed to be a non-interactive installation, and sudo may block it. The simple way is to require that `make install` be run by root, as is our common practice.
06-02-2017, 01:43 PM
Post: #25
RE: dexer
(05-31-2017 11:57 AM)zma Wrote:  
(05-31-2017 11:40 AM)YU_Xinjie Wrote:  @Zhiqiang

I suggest we add a '-t' option into the ssh of dexer.
Please review.

Quote:...
ssh -t $username@$ip "$command"
...

Sounds good.

@Zhiqiang
It is implemented, but I found that Sage/GLAD reports some warning/error messages about tty when I run tests on gm55.
I will roll back the '-t' change first and check it later.
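
One possible cause, which the thread does not confirm: when ssh is given -t but its own stdin is not a terminal (for example under a test harness), it prints "Pseudo-terminal will not be allocated because stdin is not a terminal." A rough sketch of passing -t only when a terminal is present follows; ssh_tty_opts is a hypothetical helper, not part of dexer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ssh tty option to use, if any.
# -t is added only when stdin (fd 0) is a terminal.
ssh_tty_opts() {
    if [ -t 0 ]; then
        echo "-t"
    fi
}

# Assumed usage inside dexer's loop (username, ip, command as in the thread):
#   ssh $(ssh_tty_opts) "$username@$ip" "$command"
```

Without a terminal this falls back to plain ssh, so a remote sudo that requires a tty would fail again in that case. Per the ssh manual, repeating the option (-tt) forces tty allocation even when ssh has no local tty.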
06-02-2017, 03:10 PM
Post: #26
RE: dexer
(06-02-2017 01:43 PM)YU_Xinjie Wrote:  
(05-31-2017 11:57 AM)zma Wrote:  
(05-31-2017 11:40 AM)YU_Xinjie Wrote:  @Zhiqiang

I suggest we add a '-t' option into the ssh of dexer.
Please review.

Quote:...
ssh -t $username@$ip "$command"
...

Sounds good.

@Zhiqiang
It is implemented but I find Sage/GLAD would report some warning/error message about tty when I am running test on gm55.
I will roll back the '-t' change first, and check it later.

OK. I suggest removing the `sudo` from bin's Makefile for this case. Please take a look.
02-12-2018, 08:04 PM (This post was last modified: 02-12-2018 08:04 PM by rayluk.)
Post: #27
RE: dexer
RR @zma

Currently dexer returns 0 even when some of its commands fail. I think this is a bug.

Quote:[rayluk@tb31-2 ~]$ dexer 'a'
10.6.1.2: a
bash: a: command not found

10.6.2.2: a
bash: a: command not found

10.6.3.2: a
bash: a: command not found

[rayluk@tb31-2 ~]$ echo $?
0


However, since many tools already use dexer, changing its behaviour may cause a disaster.
Therefore, I think we can treat "always return 0" as a feature of dexer. If we want a non-zero result when a command fails, we add one more parameter.
02-12-2018, 08:15 PM
Post: #28
RE: dexer
(02-12-2018 08:04 PM)rayluk Wrote:  RR @zma

Currently dexer would return 0 when some of its command failed. I think this should be a bug

However, since many tools have used dexer, changing its behaviour of may cause a disaster.
Therefore, I think we can let always return 0 to be a feature of dexer. If we want non-zero result when the command failed, we add one more parameter.

The return code is undefined. But forcing a behavior change may break too many tools.

I suggest the following:

add a --check-status option:

- if --check-status is specified: return 0 if all commands on all nodes return 0; return non-zero otherwise.
- if --check-status is not specified: the return code is undefined.
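
The proposed policy could look roughly like the sketch below. The helper names final_status and parse_check_status are hypothetical, and returning 0 when --check-status is absent is just one choice for the "undefined" case (it matches what was observed above).

```shell
#!/usr/bin/env bash
# Sketch of the proposed --check-status policy.
# final_status CHECK RC... : CHECK is 1 if --check-status was given, else 0;
# the remaining arguments are the per-node exit codes.
final_status() {
    local check="$1" rc
    shift
    if [ "$check" -eq 1 ]; then
        for rc in "$@"; do
            [ "$rc" -eq 0 ] || return 1
        done
    fi
    # Without --check-status the proposal leaves the code undefined;
    # this sketch keeps returning 0 so existing callers are unaffected.
    return 0
}

# Minimal option scan: print 1 if --check-status is among the arguments.
parse_check_status() {
    local arg
    for arg in "$@"; do
        [ "$arg" = "--check-status" ] && { echo 1; return; }
    done
    echo 0
}
```

Keeping the check behind an option means existing tools that ignore or depend on the current return code do not change behaviour.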
02-12-2018, 08:22 PM (This post was last modified: 02-12-2018 08:24 PM by YU_Xinjie.)
Post: #29
RE: dexer
(02-12-2018 08:04 PM)rayluk Wrote:  RR @zma

Currently dexer would return 0 when some of its command failed. I think this should be a bug

However, since many tools have used dexer, changing its behaviour of may cause a disaster.
Therefore, I think we can let always return 0 to be a feature of dexer. If we want non-zero result when the command failed, we add one more parameter.

I roughly remember having raised this issue before, but perhaps it is still a TODO.

I have implemented a function named "wait_errexit" in cod://glad/glad/bin/glad-common.sh, which is a return-code-respecting version of "wait".
You may consider generalizing/reusing that function in your design.
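
For readers without access to glad-common.sh, the general idea of a return-code-respecting wait can be sketched as follows. wait_all_checked is a hypothetical stand-in written for this illustration; the real wait_errexit may differ in interface and behaviour.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a "return code respecting" wait; this is not the
# wait_errexit code from glad-common.sh.
# Usage: wait_all_checked PID...
# Waits for every PID and returns the last non-zero exit code seen, or 0.
wait_all_checked() {
    local pid rc result=0
    for pid in "$@"; do
        wait "$pid"
        rc=$?
        [ "$rc" -eq 0 ] || result=$rc
    done
    return "$result"
}

# Example with three local background jobs; the second one fails.
pids=""
(exit 0) & pids="$pids $!"
(exit 7) & pids="$pids $!"
(exit 0) & pids="$pids $!"
wait_all_checked $pids
echo "exit status: $?"
```

A plain `wait` with no arguments waits for all children but returns 0 regardless of their exit codes, which is why each PID is waited on individually here.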
02-13-2018, 03:45 PM (This post was last modified: 02-13-2018 03:45 PM by rayluk.)
Post: #30
RE: dexer
@zma,

Following the previous discussion, I think dexer should return a non-zero value when any of its children returns a non-zero value.

I suggest we follow the approach of mulop: use an array of PIDs to keep track of the children.

Since the pseudocode in the head post has not been updated, the following is the current pseudocode as observed:
Code:
$ips?=/thinker/etc/ips.cfg
if $ips not exist then $ips=$think_base/conf/ips.cfg
if 'p'  is a parameter then pastat=True; else pastat=False
if '--shell' is a parameter then use_shell=True; else use_shell=False

if use_shell then command = bash -c "$command"

if pastat == False
{
  for ip in $ips
  {
    ssh -t $username@$ip "$command"
  }
}
else
{
  for ip in $ips
  {
    ssh -t $username@$ip "$command" &
  }
}


Pseudocode for the non-zero return value:
Code:
$ips?=/thinker/etc/ips.cfg
if $ips not exist then $ips=$think_base/conf/ips.cfg
if 'p'  is a parameter then pastat=True; else pastat=False
if '--shell' is a parameter then use_shell=True; else use_shell=False

if use_shell then command = bash -c "$command"


(( -> globe_rr=0 ))
if pastat == False
{
  for ip in $ips
  {
    ssh -t $username@$ip "$command" && rr=$? || rr=$?
    if $rr != 0 then $globe_rr = $rr
  }
}
else
{
  (( -> pid_array="" ))
  for ip in $ips
  {
    ssh -t $username@$ip "$command" & append $! to $pid_array
  }
  for $pid in $pid_array
  {
    wait $pid && rr=$? || rr=$?
    if $rr != 0 then $globe_rr = $rr
  }
}
return $globe_rr
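
A runnable approximation of the pseudocode above, for experimenting without a cluster. Assumptions: run_on_node is a hypothetical stand-in that runs the command locally instead of calling ssh, the node list is passed as arguments instead of being read from ips.cfg, and dexer_sketch is an illustrative name, not the real dexer entry point.

```shell
#!/usr/bin/env bash
# run_on_node is a hypothetical stand-in for: ssh "$username@$ip" "$command"
run_on_node() {
    local ip="$1" command="$2"
    echo "$ip: $command"
    sh -c "$command"
}

# dexer_sketch PARALLEL COMMAND IP...
# PARALLEL is 1 for the "-p" behaviour, 0 for sequential execution.
# Returns the last non-zero exit code seen on any node, or 0.
dexer_sketch() {
    local parallel="$1" command="$2" ip pid rc globe_rr=0 pid_list=""
    shift 2
    if [ "$parallel" -eq 0 ]; then
        for ip in "$@"; do
            run_on_node "$ip" "$command"
            rc=$?
            [ "$rc" -eq 0 ] || globe_rr=$rc
        done
    else
        # Scatter: start every node in the background and remember its PID.
        for ip in "$@"; do
            run_on_node "$ip" "$command" &
            pid_list="$pid_list $!"
        done
        # Gather: wait on each PID so its exit code is not lost.
        for pid in $pid_list; do
            wait "$pid"
            rc=$?
            [ "$rc" -eq 0 ] || globe_rr=$rc
        done
    fi
    return "$globe_rr"
}

dexer_sketch 1 'exit 4' 10.6.1.2 10.6.2.2
echo "exit status: $?"
```

The sketch reads `$?` directly, which assumes errexit (`set -e`) is off; the `&& rr=$? || rr=$?` form in the pseudocode is the variant that also works when errexit is on.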