srun and openmpi not launching parallel jobsPostman not launching anymore in 18.04How to install openmpi...
Is this nominative case or accusative case?
Was it really inappropriate to write a pull request for the company I interviewed with?
Did Amazon pay $0 in taxes last year?
What is the meaning of option 'by' in TikZ Intersections
Can you run a ground wire from stove directly to ground pole in the ground
Why do we call complex numbers “numbers” but we don’t consider 2 vectors numbers?
Can a Mimic (container form) actually hold loot?
What does "rhumatis" mean?
What can I do if someone tampers with my SSH public key?
Are there other characters in the Star Wars universe who had damaged bodies and needed to wear an outfit like Darth Vader?
Convert an array of objects to array of the objects' values
Ultrafilters as a double dual
Named nets not connected in Eagle board design
How spaceships determine each other's mass in space?
Remove object from array based on array of some property of that object
An Undercover Army
Are small insurances worth it
Why won't the strings command stop?
Is "cogitate" an appropriate word for this?
The Key to the Door
Naming Characters after Friends/Family
How do you make a gun that shoots melee weapons and/or swords?
Under what conditions would I NOT add my Proficiency Bonus to a Spell Attack Roll (or Saving Throw DC)?
3.5% Interest Student Loan or use all of my savings on Tuition?
srun and openmpi not launching parallel jobs
Postman not launching anymore in 18.04How to install openmpi modules in Ubuntu 18.04?Canon Printer “Does not accept Jobs” on Ubuntu 18.04The Stanley Parable Demo not launching on Ubuntu 18.04ubuntu 18.04 system setting is not launchingmailspring snap not launchingSystem settings not launching on Ubuntu 18.04Nautilus not launchingSynaptic Package Manger is not launching in Ubuntu 18.04Ubuntu 18.04 - Brother Laser Printer - “does not accept jobs” - pending
I can't seem to run MPI jobs using slurm. Any help or advice?
Running a local home based mini cluster to use all of my processors. I'm using 18.04, and have installed the stock openmpi and slurm packages. I have a small test program that I use to show which cores I'm running on. When I run with mpirun I get:
$ mpirun -N 3 MPI_Hello
Process 1 on ubuntu18.davidcarter.ca, out of 3
Process 2 on ubuntu18.davidcarter.ca, out of 3
Process 0 on ubuntu18.davidcarter.ca, out of 3
When I run with srun I get:
$ srun -n 3 MPI_Hello
Process 0 on ubuntu18.davidcarter.ca, out of 1
Process 0 on ubuntu18.davidcarter.ca, out of 1
Process 0 on ubuntu18.davidcarter.ca, out of 1
I've done this many times with different arguments (--mpi=pmi2, --mpi=openmpi, etc) and can confirm that instead of running a job with n parallel threads it runs n single threaded jobs. n times the work with 1/n times the expected resources per job.
This is my test program:
#include <stdio.h>
#include <mpi.h>
int main(int argc, char** argv) {
int numprocs, rank, namelen;
char processor_name[MPI_MAX_PROCESSOR_NAME];
MPI_Init (&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(processor_name, &namelen);
printf("Process %d on %s, out of %dn", rank, processor_name, numprocs);
MPI_Finalize();
}
This is my /etc/slurm-llnl/slurm.conf:
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=ubuntu18.davidcarter.ca
#ControlAddr=
#
#MailProg=/bin/mail
MpiDefault=none
MpiParams=ports=12000-12100
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
TaskPlugin=task/affinity
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
#SchedulerPort=7321
SelectType=select/linear
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=cluster
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
#SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
#SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
#
# COMPUTE NODES
#NodeName=compute[1-4] Sockets=1 CoresPerSocket=2 RealMemory=1900 State=UNKNOWN
#NodeName=compute[1-2] Sockets=1 CoresPerSocket=4 RealMemory=3800 State=UNKNOWN
NodeName=compute1 Sockets=8 CoresPerSocket=1 RealMemory=7900 State=UNKNOWN
NodeName=ubuntu18 Sockets=1 CoresPerSocket=3 RealMemory=7900 State=UNKNOWN
#
# Partitions
PartitionName=debug Nodes=ubuntu18 Default=YES MaxTime=INFINITE State=UP OverSubscribe=NO
PartitionName=batch Nodes=compute1 Default=NO MaxTime=INFINITE State=UP OverSubscribe=NO
PartitionName=prod Nodes=compute1,ubuntu18 Default=NO MaxTime=INFINITE State=UP OverSubscribe=NO
18.04
New contributor
David Carter is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
add a comment |
I can't seem to run MPI jobs using slurm. Any help or advice?
Running a local home based mini cluster to use all of my processors. I'm using 18.04, and have installed the stock openmpi and slurm packages. I have a small test program that I use to show which cores I'm running on. When I run with mpirun I get:
$ mpirun -N 3 MPI_Hello
Process 1 on ubuntu18.davidcarter.ca, out of 3
Process 2 on ubuntu18.davidcarter.ca, out of 3
Process 0 on ubuntu18.davidcarter.ca, out of 3
When I run with srun I get:
$ srun -n 3 MPI_Hello
Process 0 on ubuntu18.davidcarter.ca, out of 1
Process 0 on ubuntu18.davidcarter.ca, out of 1
Process 0 on ubuntu18.davidcarter.ca, out of 1
I've done this many times with different arguments (--mpi=pmi2, --mpi=openmpi, etc) and can confirm that instead of running a job with n parallel threads it runs n single threaded jobs. n times the work with 1/n times the expected resources per job.
This is my test program:
#include <stdio.h>
#include <mpi.h>
int main(int argc, char** argv) {
int numprocs, rank, namelen;
char processor_name[MPI_MAX_PROCESSOR_NAME];
MPI_Init (&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(processor_name, &namelen);
printf("Process %d on %s, out of %dn", rank, processor_name, numprocs);
MPI_Finalize();
}
This is my /etc/slurm-llnl/slurm.conf:
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=ubuntu18.davidcarter.ca
#ControlAddr=
#
#MailProg=/bin/mail
MpiDefault=none
MpiParams=ports=12000-12100
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
TaskPlugin=task/affinity
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
#SchedulerPort=7321
SelectType=select/linear
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=cluster
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
#SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
#SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
#
# COMPUTE NODES
#NodeName=compute[1-4] Sockets=1 CoresPerSocket=2 RealMemory=1900 State=UNKNOWN
#NodeName=compute[1-2] Sockets=1 CoresPerSocket=4 RealMemory=3800 State=UNKNOWN
NodeName=compute1 Sockets=8 CoresPerSocket=1 RealMemory=7900 State=UNKNOWN
NodeName=ubuntu18 Sockets=1 CoresPerSocket=3 RealMemory=7900 State=UNKNOWN
#
# Partitions
PartitionName=debug Nodes=ubuntu18 Default=YES MaxTime=INFINITE State=UP OverSubscribe=NO
PartitionName=batch Nodes=compute1 Default=NO MaxTime=INFINITE State=UP OverSubscribe=NO
PartitionName=prod Nodes=compute1,ubuntu18 Default=NO MaxTime=INFINITE State=UP OverSubscribe=NO
18.04
New contributor
David Carter is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
add a comment |
I can't seem to run MPI jobs using slurm. Any help or advice?
Running a local home based mini cluster to use all of my processors. I'm using 18.04, and have installed the stock openmpi and slurm packages. I have a small test program that I use to show which cores I'm running on. When I run with mpirun I get:
$ mpirun -N 3 MPI_Hello
Process 1 on ubuntu18.davidcarter.ca, out of 3
Process 2 on ubuntu18.davidcarter.ca, out of 3
Process 0 on ubuntu18.davidcarter.ca, out of 3
When I run with srun I get:
$ srun -n 3 MPI_Hello
Process 0 on ubuntu18.davidcarter.ca, out of 1
Process 0 on ubuntu18.davidcarter.ca, out of 1
Process 0 on ubuntu18.davidcarter.ca, out of 1
I've done this many times with different arguments (--mpi=pmi2, --mpi=openmpi, etc) and can confirm that instead of running a job with n parallel threads it runs n single threaded jobs. n times the work with 1/n times the expected resources per job.
This is my test program:
#include <stdio.h>
#include <mpi.h>
int main(int argc, char** argv) {
int numprocs, rank, namelen;
char processor_name[MPI_MAX_PROCESSOR_NAME];
MPI_Init (&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(processor_name, &namelen);
printf("Process %d on %s, out of %dn", rank, processor_name, numprocs);
MPI_Finalize();
}
This is my /etc/slurm-llnl/slurm.conf:
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=ubuntu18.davidcarter.ca
#ControlAddr=
#
#MailProg=/bin/mail
MpiDefault=none
MpiParams=ports=12000-12100
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
TaskPlugin=task/affinity
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
#SchedulerPort=7321
SelectType=select/linear
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=cluster
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
#SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
#SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
#
# COMPUTE NODES
#NodeName=compute[1-4] Sockets=1 CoresPerSocket=2 RealMemory=1900 State=UNKNOWN
#NodeName=compute[1-2] Sockets=1 CoresPerSocket=4 RealMemory=3800 State=UNKNOWN
NodeName=compute1 Sockets=8 CoresPerSocket=1 RealMemory=7900 State=UNKNOWN
NodeName=ubuntu18 Sockets=1 CoresPerSocket=3 RealMemory=7900 State=UNKNOWN
#
# Partitions
PartitionName=debug Nodes=ubuntu18 Default=YES MaxTime=INFINITE State=UP OverSubscribe=NO
PartitionName=batch Nodes=compute1 Default=NO MaxTime=INFINITE State=UP OverSubscribe=NO
PartitionName=prod Nodes=compute1,ubuntu18 Default=NO MaxTime=INFINITE State=UP OverSubscribe=NO
18.04
New contributor
David Carter is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
I can't seem to run MPI jobs using slurm. Any help or advice?
Running a local home based mini cluster to use all of my processors. I'm using 18.04, and have installed the stock openmpi and slurm packages. I have a small test program that I use to show which cores I'm running on. When I run with mpirun I get:
$ mpirun -N 3 MPI_Hello
Process 1 on ubuntu18.davidcarter.ca, out of 3
Process 2 on ubuntu18.davidcarter.ca, out of 3
Process 0 on ubuntu18.davidcarter.ca, out of 3
When I run with srun I get:
$ srun -n 3 MPI_Hello
Process 0 on ubuntu18.davidcarter.ca, out of 1
Process 0 on ubuntu18.davidcarter.ca, out of 1
Process 0 on ubuntu18.davidcarter.ca, out of 1
I've done this many times with different arguments (--mpi=pmi2, --mpi=openmpi, etc) and can confirm that instead of running a job with n parallel threads it runs n single threaded jobs. n times the work with 1/n times the expected resources per job.
This is my test program:
#include <stdio.h>
#include <mpi.h>
int main(int argc, char** argv) {
int numprocs, rank, namelen;
char processor_name[MPI_MAX_PROCESSOR_NAME];
MPI_Init (&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(processor_name, &namelen);
printf("Process %d on %s, out of %dn", rank, processor_name, numprocs);
MPI_Finalize();
}
This is my /etc/slurm-llnl/slurm.conf:
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=ubuntu18.davidcarter.ca
#ControlAddr=
#
#MailProg=/bin/mail
MpiDefault=none
MpiParams=ports=12000-12100
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
TaskPlugin=task/affinity
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
#SchedulerPort=7321
SelectType=select/linear
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=cluster
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
#SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
#SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
#
# COMPUTE NODES
#NodeName=compute[1-4] Sockets=1 CoresPerSocket=2 RealMemory=1900 State=UNKNOWN
#NodeName=compute[1-2] Sockets=1 CoresPerSocket=4 RealMemory=3800 State=UNKNOWN
NodeName=compute1 Sockets=8 CoresPerSocket=1 RealMemory=7900 State=UNKNOWN
NodeName=ubuntu18 Sockets=1 CoresPerSocket=3 RealMemory=7900 State=UNKNOWN
#
# Partitions
PartitionName=debug Nodes=ubuntu18 Default=YES MaxTime=INFINITE State=UP OverSubscribe=NO
PartitionName=batch Nodes=compute1 Default=NO MaxTime=INFINITE State=UP OverSubscribe=NO
PartitionName=prod Nodes=compute1,ubuntu18 Default=NO MaxTime=INFINITE State=UP OverSubscribe=NO
18.04
18.04
New contributor
David Carter is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
David Carter is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
David Carter is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
asked 6 mins ago
David CarterDavid Carter
1
1
New contributor
David Carter is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
David Carter is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
David Carter is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
add a comment |
add a comment |
0
active
oldest
votes
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "89"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
David Carter is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f1123930%2fsrun-and-openmpi-not-launching-parallel-jobs%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
0
active
oldest
votes
0
active
oldest
votes
active
oldest
votes
active
oldest
votes
David Carter is a new contributor. Be nice, and check out our Code of Conduct.
David Carter is a new contributor. Be nice, and check out our Code of Conduct.
David Carter is a new contributor. Be nice, and check out our Code of Conduct.
David Carter is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Ask Ubuntu!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f1123930%2fsrun-and-openmpi-not-launching-parallel-jobs%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown