dexer
01-23-2017, 08:41 PM
Post: #21
RE: dexer
(01-22-2017 09:37 PM)lingu Wrote:  
(01-22-2017 04:12 PM)zma Wrote:  
(01-22-2017 03:10 PM)YU_Xinjie Wrote:  We may consider making "return non-zero when the cmd on any node fails" the default behaviour of dexer.
It is usually helpful to exit as soon as possible when a command fails.
If dexer is run as root, it is even more important to stop on failure.

An implementation we can refer to is the wait_errexit function in glad-common.sh.

@Zhiqiang @Wentao @lingu

What is your opinion?

Dying early and reporting early sounds like a good idea to me as the default behavior.

If users want to tolerate failures in dexer, there are many ways to do so.

In distributed execution ("-p"), scatter-gather is better than strict early termination, which requires preemptive killing.

In any case, I think returning non-zero and early termination are two separate issues, and in this case we handle returning non-zero only. It is fine for dexer to return non-zero if any scatter task fails on any node.

Agreed. This is exactly what I wanted to say and how wait_errexit() of glad works.
I will propose a detailed design later.
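
To make the "return non-zero only" idea concrete, here is a minimal sketch of a scatter-gather wait. It is not dexer's code and not glad's wait_errexit(); the function name scatter_gather and the use of `sh -c` for each task are assumptions for illustration only. All tasks are started, none is killed, and the function returns non-zero if any of them failed.

```shell
# Hypothetical sketch, not the actual dexer or glad implementation.
# scatter_gather TASK...
# Starts every TASK in the background, waits for all of them, and
# returns 1 if at least one TASK exited with a non-zero status.
scatter_gather() {
    sg_pids=""
    sg_failed=0
    for sg_task in "$@"; do
        sh -c "$sg_task" &            # scatter: start without waiting
        sg_pids="$sg_pids $!"         # remember the PID of this job
    done
    for sg_pid in $sg_pids; do
        # gather: wait PID returns the exit status of that job
        wait "$sg_pid" || sg_failed=1
    done
    return "$sg_failed"
}
```

Usage would look like `scatter_gather 'true' 'exit 3' 'true'`, which lets all three tasks finish and then returns 1.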
01-24-2017, 10:52 AM
Post: #22
RE: dexer
To be clear, what I meant by "die early" is: if there are errors, stop invoking any further tasks, to keep things simple. This is what wait_errexit() is well suited for.
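
A rough sketch of "die early" in this sense, for the sequential case: once one task fails, no further task is invoked. The name run_until_failure is hypothetical and the sketch does not reproduce wait_errexit(); it only illustrates the behaviour described above.

```shell
# Hypothetical sketch, not the actual wait_errexit() from glad-common.sh.
# run_until_failure TASK...
# Runs each TASK in order and stops at the first one that fails,
# so later tasks are never started.
run_until_failure() {
    for ruf_task in "$@"; do
        if ! sh -c "$ruf_task"; then
            echo "task failed, skipping the rest: $ruf_task" >&2
            return 1
        fi
    done
    return 0
}
```

Unlike preemptive killing, nothing that is already running gets interrupted here; the function simply declines to start more work.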
05-31-2017, 11:40 AM
Post: #23
RE: dexer
@Zhiqiang

Driving example:
When I tried to update the bin tool on gm55 with dexer, it complained:
Code:
[root@gm55a-2 bin]# dexer 'cd /thinker/net/bin; make install'
10.6.1.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.2.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.3.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.4.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.5.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.6.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

10.6.7.2: cd /thinker/net/bin; make install
Installing to /thinker/bin (run with sudo)
sudo: sorry, you must have a tty to run sudo
make: *** [install] Error 1

I suggest we add the '-t' option to the ssh call in dexer.
Please review.

Quote:...
ssh -t $username@$ip "$command"
...
05-31-2017, 11:57 AM
Post: #24
RE: dexer
Sounds good.

BTW: it's strange that bin's `make install` calls `sudo`. `make install` is supposed to be a non-interactive installation, while sudo may block it. A simple way is to require that `make install` be run by root, as is our common practice.
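
As a sketch of the "require root instead of calling sudo" idea, an install step could check the user ID and refuse to continue rather than prompt. The function name require_root is hypothetical and this is not taken from bin's Makefile; the UID is passed as an argument only so the check is easy to exercise.

```shell
# Hypothetical sketch, not taken from bin's Makefile.
# require_root UID
# Succeeds only when UID is 0; otherwise prints a message and fails,
# so the install aborts instead of invoking sudo.
require_root() {
    if [ "$1" -ne 0 ]; then
        echo "install must be run as root (no sudo is invoked)" >&2
        return 1
    fi
    return 0
}
```

An install recipe could then start with `require_root "$(id -u)"` and stop immediately for non-root users, which keeps the installation non-interactive.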
06-02-2017, 01:43 PM
Post: #25
RE: dexer
@Zhiqiang
It is implemented, but I find that Sage/GLAD reports some warning/error messages about the tty when I run tests on gm55.
I will roll back the '-t' change first and check it later.
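
One option to consider later, sketched under assumptions: make '-t' opt-in, so the default path used by the tests stays unchanged. The function name remote_run and its yes/no argument are hypothetical and are not existing dexer options; the only ssh feature relied on is '-t', which forces pseudo-terminal allocation.

```shell
# Hypothetical sketch, not the actual dexer code.
# remote_run USE_TTY TARGET COMMAND
# Passes -t to ssh only when USE_TTY is "yes"; otherwise ssh is
# invoked exactly as before the change.
remote_run() {
    rr_tty=$1
    rr_target=$2
    rr_cmd=$3
    if [ "$rr_tty" = "yes" ]; then
        ssh -t "$rr_target" "$rr_cmd"
    else
        ssh "$rr_target" "$rr_cmd"
    fi
}
```

With this shape, `remote_run yes "$username@$ip" "$command"` would cover the `make install` case, while callers that pass `no` would see no tty-related change.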
06-02-2017, 03:10 PM
Post: #26
RE: dexer
OK. I suggest removing the `sudo` from bin's Makefile for this case. Please take a look.