wiki:docs_openstack

running modules on Jetstream with OpenStack clusters

  • experimental; available in genappalpha rev > 1157

background

  • We create virtual clusters dynamically for jobs on Jetstream using OpenStack commands.
  • This allows dedicated MPI jobs to be run.
  • The module's executable starts on one of the virtual nodes and is responsible for running the MPI job.

prerequisites

  • this currently works only on Jetstream, on a CentOS 7.2 host with the OpenStack client installed
  • set "resource" : "oscluster" in the module JSON to use the OpenStack control for this module
  • the module must provide an input field with "id" : "numprocs" ; this is used to determine the number of virtual nodes to create
  • appconfig.json also needs to be properly set up
    • the ALL-CAPS entries need to be changed to the correct values for the user
         "resources" : {
              "local" : ""
              ,"oscluster" : {
                  "run" : "oscluster"
                  ,"properties" : {
                      "flavor"         : "m1.medium"
                      ,"ppn"           : 6
                      ,"flavors"       : {
                          "m1.tiny"     : 1
                          ,"m1.small"   : 2
                          ,"m1.medium"  : 6
                          ,"m1.large"   : 10
                          ,"m1.xlarge"  : 24
                          ,"m1.xxlarge" : 44
                      }
                      ,"baseimage"     : "3bc1c526-1724-43ac-a89f-99c92fca130d"
                      ,"key"           : "apache"
                      ,"secgroup"      : "sshlocalall"
                      ,"project"       : "XSEDE_PROJECT_ID"
                      ,"user"          : "TACC_PORTAL_USER"
                      ,"password"      : "TACC_PORTAL_USER_PASSWORD"
                      ,"domain"        : "tacc"
                      ,"api_version"   : 3
                      ,"auth_url"      : "https://jblb.jetstream-cloud.org:35357/v3"
                      ,"user_data"     : "mount 10.0.0.7:/opt /opt -o rw,noatime,nodiratime,async 2>&1 > /tmp/userdata.log;service iptables stop 2>&1 > /tmp/servicestop.log;touch /tmp/ready"
                  }
              }
          },
      
  • note that "flavor" and "ppn" define the defaults for oscluster jobs; since "ppn" can be set above or below the processor count of the chosen flavor, each virtual cluster node created can be over- or under-loaded.
  • directives.json needs an extra value
      "docrootactual" : {
           "html5" : "/opt/genapp/APPNAME/output/html5"
      },
    
    • where APPNAME is replaced by the application name
    • this is different from "directives" : { "docroot" : "html5" } , which typically lists the www directory without the APPNAME
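As a concrete illustration of how the "flavors" table, "ppn", and the "numprocs" input interact, here is a minimal sketch of the node-count computation (the function name and exact rounding are assumptions; the actual GenApp logic may differ):

```python
import math

# "flavors" table from appconfig.json: flavor name -> processors per node (ppn)
FLAVORS = {
    "m1.tiny": 1, "m1.small": 2, "m1.medium": 6,
    "m1.large": 10, "m1.xlarge": 24, "m1.xxlarge": 44,
}

def nodes_needed(numprocs, flavor="m1.medium"):
    """Assumed computation: enough nodes of the given flavor to cover numprocs."""
    ppn = FLAVORS[flavor]
    return math.ceil(numprocs / ppn)
```

For example, requesting 8 processors with the default m1.medium flavor (6 ppn) would create 2 virtual nodes.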

module json

  • in your module JSON, you will need to enable input of the "numprocs" value, e.g.
            ,{
                "role"     : "input",
                "id"       : "numprocs",
                "label"    : "Number of Processors",
                "type"     : "integer",
                "step"     : 1,
                "min"      : 1,
                "max"      : 8,
                "default"  : 1,
                "required" : "true",
                "help"     : "Select the number of processors"
            }
    
  • where "step" and "min" should each be a multiple of the number of processors per node
  • the number of virtual nodes required will be automatically computed
  • for advanced "fine-grained" user control, the following can be added:
            ,{
                "role"     : "input",
                "id"       : "apc",
                "label"    : "Advanced process control",
                "type"     : "checkbox",
                "default"  : "off",
                "repeater" : "true",
                "help"     : "Set specific parameters for the flavor of virtual nodes, the number of processes to use per node and the number of nodes to use.<p>N.B. Setting these overrides the Number of Processors selected above."
            }
            ,{
                "role"     : "input",
                "id"       : "os_flavor",
                "label"    : "Flavor",
                "type"     : "listbox",
                "values"   : "tiny 1 ppn~m1_tiny~small 2 ppn~m1_small~medium 6 ppn~m1_medium~large 10 ppn~m1_large~xlarge 24 ppn~m1_xlarge~xxlarge 44 ppn (full physical node)~m1_xxlarge",
                "default"  : "m1_xlarge",
                "repeat"   : "apc",
                "help"     : "Select the flavor for virtual nodes.  This will determine the number of physical processors available per node (ppn)."
            }
            ,{
                "role"     : "input",
                "id"       : "os_ppn",
                "label"    : "Use processors per node",
                "type"     : "integer",
                "default"  : 24,
                "min"      : 1,
                "max"      : 44,
                "required" : "true",
                "repeat"   : "apc",
                "help"     : "Select the number of processors to utilize per node.<p>N.B. If this is greater than the ppn for the flavor defined above, the processors will be overloaded and efficiency will likely suffer."
            }
            ,{
                "role"     : "input",
                "id"       : "os_nodes",
                "label"    : "Number of nodes to use",
                "type"     : "integer",
                "min"      : 1,
                "default"  : 1,
                "max"      : 6,
                "required" : "true",
                "repeat"   : "apc",
                "help"     : "Select the number of nodes to allocate."
            }
            ,{
                "role"     : "input",
                "id"       : "os_total_procs",
                "label"    : "Processors to use",
                "type"     : "float",
                "calc"     : "os_nodes*os_ppn",
                "readonly" : "true",
                "repeat"   : "apc",
                "help"     : "The computed total number of processors to use."
            }
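The override behavior described in the "apc" help text can be summarized in a short sketch (the field names follow the module JSON above; the function itself is illustrative, not GenApp code):

```python
def total_procs(fields):
    """Illustrative only: with advanced process control enabled, the
    os_nodes * os_ppn product (matching the "calc" expression for
    os_total_procs) replaces the plain numprocs selection."""
    if fields.get("apc") == "on":
        return fields["os_nodes"] * fields["os_ppn"]
    return fields["numprocs"]
```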
    

executable

  • the executable will be given, in the input JSON, four new variables
    • "_clusternodecount" : N, the number of virtual nodes created
    • "_clusterips" : [ ... ], an array of the IP addresses of the virtual nodes
    • "_clusteripsjoined" : "IP1,IP2,...", a string of the IP addresses joined with commas
    • "_clusterhostfile" : "filename", the filename of the hostfile in OpenMPI 1.10.4-or-greater format
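For example, with two virtual nodes the added input-JSON entries might look like the following (the IP addresses are placeholders, and the slot counts are assumed to follow the flavor's ppn):

```
"_clusternodecount" : 2
,"_clusterips" : [ "10.0.0.12", "10.0.0.13" ]
,"_clusteripsjoined" : "10.0.0.12,10.0.0.13"
,"_clusterhostfile" : "hostfile"
```

where the referenced hostfile would contain one OpenMPI-style line per node:

```
10.0.0.12 slots=6
10.0.0.13 slots=6
```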

command line management utils

  • in GENAPP/languages/html5/openstack/util there are management programs that can be run from the command line
    • you will need to copy these into a directory and also have an appropriate appconfig.json file there
      • php os_cluster_cli.php #-of-nodes name will start up a virtual cluster
      • php os_delete_cli.php name will delete a virtual cluster
      • php os_status_cli.php name will show current instances
        • note the "name" appears in the list in a form like this: TG-MCB140255-run-1f8231c0-89b9-11e6-9122-01d1fdcfd369-000
          • the name used in this case to delete would be 1f8231c0-89b9-11e6-9122-01d1fdcfd369
      • php os_integrity_cli.php will check the virtual clusters against the registered "running" jobs and the php processes running them.
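To recover the deletable name from a listed instance, the embedded UUID can be extracted; a convenience sketch (not part of the shipped utilities):

```python
import re

# An os_status_cli.php listing shows names like
#   TG-MCB140255-run-1f8231c0-89b9-11e6-9122-01d1fdcfd369-000
# while os_delete_cli.php wants only the UUID portion.
def cluster_uuid(instance_name):
    """Return the first UUID-shaped substring of the instance name, or None."""
    m = re.search(r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}',
                  instance_name)
    return m.group(0) if m else None
```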
Last modified on Mar 18, 2017, 5:44:44 AM