Version 1.4

Version Warning

When using this document, check its last-updated timestamp. If you are deploying a recent version and this document was last updated months before that version was released, some information may be missing. The reverse also applies: if you are deploying an older version, use the document history to find an older deployment guide.


For now, the message box below will be updated to indicate the GOBii version for which this document was last updated.

GOBii Version

Version 1.4

UPDATE: This page is now cloned for each release version, as the deployment process can vary significantly between releases. To install any version prior to 1.4, go through the page history and search for the text "1.x", where x is the minor release version you are looking for. If you have trouble finding it, please contact the Support Portal or send an email to support@GOBiiproject.atlassian.net


Definition of Terms

  • Nodes = GOBii Nodes
    • The term "nodes" here will always refer to the GOBii nodes, which are Docker containers that can be deployed to different servers or virtual environments. Server nodes, on the other hand, will be explicitly called "server node".

Background

GOBii is made up of multiple modules, categorized by function. A system diagram that shows these categories (by Docker container), the data flow, and the modules is available here.

Depending on your server topology, the instructions on this page may require some tweaking. Each section whose steps differ significantly by server topology includes a "Note Box" describing the differences.


GOBii's deployment architecture is flexible and node-based. There are three main nodes: computation, database, and web. These nodes are now pre-baked into Docker images and can be deployed in their own server, VM, or in any combination of servers and virtual environments.

To give you an idea, here's an example topology and node-distribution:

Example Topology
Server 1:
	Server Head: GOBii Test (all nodes)
	Server Node1: GOBii Prod Database Node
	Server Node2: GOBii Prod Web Node
	Server Node3: GOBii Prod Compute Node


Current Limitation

You can put GOBii nodes of the same GOBii instance into one server, but we advise against mixing nodes of different GOBii instances into one server. Aside from competing for resources, there are potential conflict points that nodes from different instances may run into.


Prerequisites

  1. The official repository for the deployment scripts is here. Make sure you clone or download the scripts from there. Use the master branch (the default branch), unless you are deploying a non-release version (ex. testing a release candidate such as 1.5 RC).
  2. Finalize your topology and write it down, because it determines which scripts you run: if you are deploying all three GOBii nodes to a single server, you run one script; for one server per node, or any other variation, you run three scripts.
  3. The servers should have Docker Engine version 17 or later installed. Make sure the servers have access to Docker Hub.
    1. Ubuntu: https://docs.docker.com/engine/installation/linux/docker-ce/ubuntu/#upgrade-docker-ce-1
    2. CentOS: https://docs.docker.com/install/linux/docker-ce/centos/
  4. A mount point or a shared drive that all the nodes can access — this will be a volume mounted to all the three Docker containers.
  5. The user that runs the scripts must be a sudoer and a member of the GOBii and docker groups. This document uses the user 'gadm' throughout; the username itself is arbitrary, but it must be used consistently. For example, to add gadm to the docker group:

    1. sudo usermod -aG docker gadm
  6. (Optional) A directory where the postgres data will reside. The default is Ubuntu's postgres directory in the DB Docker (ex. /var/lib/pgsql/data), which is linked to Docker's default volume directory (ex. /usr/local/Docker/volumes/postgreslibubuntu).
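
Some of the prerequisites above can be checked from a shell before any deployment script is run. The following is a minimal sketch, not part of the GOBii deployment scripts: the function names (docker_version_ok, user_in_group, preflight) are hypothetical, and it assumes bash and a host where the docker CLI is on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; not part of the official deployment scripts.

# Succeeds if a Docker version string such as "17.09.0-ce" has a major version of 17 or later.
docker_version_ok() {
  local major="${1%%.*}"
  [[ "$major" =~ ^[0-9]+$ ]] && (( 10#$major >= 17 ))
}

# Succeeds if user $1 belongs to group $2.
user_in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

# Prints a warning for each prerequisite that does not appear to be met.
# Arguments: <sudoer username> <shared directory> <GOBii group name>
preflight() {
  local user="$1" shared_dir="$2" gobii_group="$3" version
  version="$(docker version --format '{{.Server.Version}}' 2>/dev/null)"
  docker_version_ok "$version" || echo "WARN: Docker Engine 17+ not detected (found '${version}')"
  user_in_group "$user" docker || echo "WARN: $user is not in the docker group"
  user_in_group "$user" "$gobii_group" || echo "WARN: $user is not in the $gobii_group group"
  [ -d "$shared_dir" ] || echo "WARN: shared directory $shared_dir does not exist"
}
```

For example, `preflight gadm /mnt/temp-GOBii GOBii` would print nothing on a host that satisfies these checks. It does not verify sudo rights or access to Docker Hub.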

GOBII TEST

For a test GOBii instance, you can use the vanilla version of the Dockers:


Deployment Steps

  1. Copy the deployment scripts and files from the cloned repository (prerequisite #1) to the shared drive (prerequisite #4).

    We update the param files from time to time (e.g. when new features are added), so please don't just copy-paste the sample param files below. They are shown on this page for reference only. Instead, pull them from our deployment scripts git repository for the particular release you are deploying (ex. release/1.4).

    The templates shown below were last updated for version 1.4.

  2. Edit the main parameter file. You can find a template in the repository (GOBiideployment/params/template_main.parameter). Before 1.4, this file had to be placed in the same directory as the deployment scripts; as of 1.4, this is no longer the case. It contains all the topology information and deployment credentials. The template is shown below, with each parameter explained in a comment above the corresponding line:

    Go through this parameter file line by line. Make sure all the values are correct, especially the names of the crops and the Docker images you want to deploy. Go to hub.docker.com and look at the repositories under our account gadm01 to make sure you have the correct image names.

    Default Credentials

    All passwords and some usernames have been omitted from the parameter file templates on this page for security. Make sure you check Version 1.4 to replace the parameters with the correct values.

    #This file will be used by the_GOBii_ship_*.sh to deploy and configure the Docker images to target hosts.
    
    #This is your shared directory (will be mounted as volume to the Dockers), it needs to be accessible from the 3 Dockers
    BUNDLE_PARENT_PATH="/mnt/temp-GOBii"
    #The name of the first crop - this needs to match what is in the provisioned Docker image in our Docker hub.
    Docker_CROP1_NAME="maize"
    #The name of the second crop - this needs to match what is in the provisioned Docker image in our Docker hub.
    Docker_CROP2_NAME="wheat"
    #IP/Hostname of the Docker compute host
    Docker_COMPUTE_HOST="172.1.2.5"
    #Name of the compute Docker. This is more of an alias to let you access it conveniently.
    Docker_COMPUTE_NAME="GOBii_compute_cimmyt"
    #Port number that will be mapped to compute node's default SSH port. Make sure it's open and doesn't conflict with anything in the host server.
    Docker_COMPUTE_SSH_PORT="2222"
    #The minimum age, in minutes, that an instruction file must reach before the cron jobs pick it up. This needs to be prefixed by '+', which means 'pick up the files that are older than n minutes'.
    Docker_CRON_FILE_AGE="+2"
    #The number of minutes between cron job executions.
    Docker_CRON_INTERVAL="2"
    #IP/Hostname of the Docker DB host
    Docker_DB_HOST="172.1.2.4"
    #Name of the database Docker. This is more of an alias to let you access it conveniently.
    Docker_DB_NAME="GOBii_db_cimmyt"
    #Port number that will be mapped to the DB Docker's 5432 port for postgres connection. Make sure it's open and doesn't conflict with anything in the host server.
    Docker_DB_PORT="5433"
    #IP/Hostname of the Docker web host
    Docker_WEB_HOST="172.1.2.3"
    #Name of the web Docker. This is more of an alias to let you access it conveniently.
    Docker_WEB_NAME="GOBii_web_cimmyt"
    #Port number that will be mapped to port 8080 in the web Docker (the Tomcat default port). Make sure it doesn't conflict with anything in the host.
    Docker_WEB_PORT="8081"
    #The group ID of the 'GOBii' group in the host machine. The name can be arbitrary, ex. 'icrisat-GOBii', as long as this GID corresponds correctly to it. The 'GOBii' group in all the 3 Dockers will be linked to it.
    #Sample command to get it: getent group GOBii (then take the first number) - this depends on your host's OS
    GOBii_GID="12345"
    #The user ID of the 'gadm' sudoer in the host machine. The name can be arbitrary, ex. 'icrisat-gadm', as long as this UID corresponds correctly to it. The 'gadm' user in all the 3 Dockers will be linked to it.
    #Sample command to get it: getent passwd gadm (then take the first number) - this depends on your host's OS
    GOBii_UID="123456789"
    #As of 1.4, this parameter is passed directly, hence this line is ignored. OBSOLETE: File name of the parameter file that will be used for the GOBii instance's configuration once installed. The GOBii-web.xml file will be generated based on the values in this parameter file. 
    CONFIGURATOR_PARAM_FILE="template_install.parameters"
    #The gadm password inside the Docker containers. You can find the actual password in the restricted page "Default Credentials" in GOBii confluence.
    Docker_GOBii_ADMIN_PASSWORD="changeme"
    #Name of the compute Docker repository under GOBii's account (gadm01) you want to pull from.
    Docker_HUB_COMPUTE_NAME="GOBii_compute_cimmyt"
    #Name of the DB Docker repository under GOBii's account (gadm01) you want to pull from.
    Docker_HUB_DB_NAME="GOBii_db_cimmyt"
    #Name of the web Docker repository under GOBii's account (gadm01) you want to pull from.
    Docker_HUB_WEB_NAME="GOBii_web_cimmyt"
    #Username of the Docker hub account you want to use. Change accordingly. If you get permission issues
    Docker_HUB_USERNAME="gadm01"
    #Name of the sudoer account in the HOST server that the Dockers' gadm account will correspond to (it can be named differently as long as their UIDs match)
    Docker_SUDOER_USERNAME="i-am-super"
    #The name of the GOBii application data bundle. Keep the default unless otherwise changed in the Docker images.
    Docker_BUNDLE_NAME="GOBii_bundle"
    #Internally used by the Dockers. Keep the default unless otherwise changed in the Docker images.
    BUNDLE_TEMP_PATH="/var/GOBii_bundle"
    #Postgres Volumes Path -- No need to change this; just create a symlink from the Docker volume directory to where you want the postgres data files to reside. Changing these volume paths from here hasn't been tested yet.
    POSTGRES_ETC="GOBiipostgresetcubuntu"
    POSTGRES_LOG="GOBiipostgreslogubuntu"
    POSTGRES_LIB="GOBiipostgreslibubuntu"
    
    
    #activate/deactivate encryption
    ACTIVATE_ENCRYPTION="false"
    
    #Docker KDC Node name as it will appear on host
    Docker_KDC_NAME="GOBii_kdc_node"
    #KDC Image name as it appears in Docker hub
    Docker_HUB_KDC_NAME="GOBii_kdc_ubuntu"
    #KDC file storage on host. Assumption is this dir lies with GOBii_parent and accessible via "/data" symlink
    #This value should match with the param kdcompute.working-directory in kdc-application.properties in kdc Docker image
    KDC_FILE_STORAGE_DIR="kdcompute_file_storage"
    
    ###Post 1.2 params start here###
    
    #Liquibase contexts
    #This handles the migration path and the seed data -- depending on the flavor of GOBii that you are trying to deploy. A quick summary of what these contexts are:
    #1. general = this context contains schema changes (ie. dropped columns, new tables, dropped tables, etc.)
    #2. seed_general = this context contains the basic seed data. It is mainly for controlled vocabularies and ontologies, ie. all seed data that needs to exist in all clients' databases.
    #3. seed_crop1 and seed_crop2 = these are example crop-specific seed contexts. You will need to ensure that the context you are using exists. New clients will have specific contexts created for them.
    #   These contexts contain contact information (ie. list of GOBii users for that instance, along with their usernames, email, and roles). Anything seed-data related that are specific to certain crops goes to this context.
    LIQUIBASE_CROP1_CONTEXTS="general,seed_general,seed_crop1"
    LIQUIBASE_CROP2_CONTEXTS="general,seed_general,seed_crop2"
    
    

    You can name this file however you want. The full file path is passed to the deployment script.

    If anything is unclear, or if you are not sure what to put in a parameter's value, please contact GOBii support (see the contact details at the top of this page).

    If a seed context for your crop is not available and you would like to have one (ex. seed_crop3), please contact GOBii support.
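
The GID/UID lookups mentioned in the template comments, and a quick check that no required value was left empty, can be sketched as below. This is a hypothetical helper, not something shipped with GOBii: the function names are illustrative, and it assumes the parameter file consists only of KEY="value" lines that bash can source. Parameter names are the ones shown in the template above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for filling in and sanity-checking the main parameter file.

# Print the numeric ID (third colon-separated field) of a getent passwd/group line.
id_from_getent_line() {
  printf '%s\n' "$1" | cut -d: -f3
}

# Succeed only if every named variable is set to a non-empty value in the file.
# Assumes the file consists of KEY="value" lines that bash can source.
check_params() {
  local file="$1"; shift
  (
    # Source inside a subshell so the caller's environment is left untouched.
    . "$file" || exit 2
    status=0
    for name in "$@"; do
      if [ -z "${!name:-}" ]; then
        echo "Missing or empty: $name" >&2
        status=1
      fi
    done
    exit "$status"
  )
}

# Example usage (paths and the selection of names are illustrative):
#   id_from_getent_line "$(getent group GOBii)"    # value for GOBii_GID
#   id_from_getent_line "$(getent passwd gadm)"    # value for GOBii_UID
#   check_params params/my_main.parameters BUNDLE_PARENT_PATH Docker_WEB_HOST Docker_DB_HOST
```

Sourcing in a subshell keeps the check from leaking deployment values into the interactive session.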

  3. Edit the install parameter file. You can find a template in the repository (GOBiideployment/params/template_install.parameter). Before 1.4, this file had to be placed in the same directory as the deployment scripts; as of 1.4, this is no longer the case. It contains the GOBii instance's entire configuration (i.e. the runtime configuration, via the content of GOBii-web.xml). The template is shown below, with each parameter explained in a comment above the corresponding line:

    #This parameter file will be used by GOBiiconfig_wrapper.sh to generate a proper GOBii-web.xml -- the main configuration file of a GOBii instance.
    #Note that you need to specify this file into the *_main.parameters so that the deployment script will pick it up.
    #Also note that all the paths here are from the point of view of the running Dockers, mainly the web Docker, hence the /data prefix on most of them.
    #The /data directory is the default working directory of all the GOBii Dockers, this is also where the application bundle is located. It is a volume mapped to the shared directory visible to all 3 Dockers.
    
    #The GOBii application data bundle's path in the context of the Docker containers. Keep the default unless otherwise changed in the Docker images.
    BUNDLE_PATH="/data/GOBii_bundle"
    #The generated GOBii-web.xml path. Keep the default unless otherwise changed in the Docker images.
    CONFIG_XML="/data/GOBii_bundle/config/GOBii-web.xml"
    #The authentication type. If you want to be able to login using the test user upon installation, set this to "TEST". If you want to immediately connect to LDAP upon installation, set to "LDAP".
    #We suggest setting it to "TEST" at first, then verifying that everything works by logging into the extractor UI. Then manually set it to "LDAP" in the GOBii-web.xml file, restart Tomcat, and you're all set.
    AUTH_TYPE="TEST"
    #The LDAP distinguished name
    LDAP_DN="uid={0}"
    #The LDAP URL
    LDAP_URL="ldaps://test.cornell.edu/ou=people,dc=testl,dc=testnet"
    #The LDAP Bind User
    LDAP_BIND_USER="uid=GOBii-user"
    #The LDAP Bind User's password
    LDAP_BIND_PASSWORD="dummypass"
    #The LDAP background user -- the webservices use this to query for valid users, etc.
    LDAP_BACKGROUND_USER="GOBii-user"
    #The LDAP background user's password
    LDAP_BACKGROUND_PASSWORD="dummypass"
    #The mail host. This can be your local mail host or a Gmail SMTP server.
    MAIL_HOST="smtp.gmail.com"
    #The mail server's port.
    MAIL_PORT=587
    #The mail user account.
    MAIL_USERNAME="GOBii.test@gmail.com"
    #The mail user account's password.
    MAIL_PASSWORD="dummypass"
    #The type of mail server, Gmail defaults to SMTP.
    MAIL_TYPE="SMTP"
    #The mail server's hash
    MAIL_HASH="na"
    #The name of the first crop. This should match what is in the Docker hub images.
    CROP1="maize"
    #The name of the second crop. This should match what is in the Docker hub images.
    CROP2="wheat"
    #The web host's domain name or IP address
    WEB_HOST="172.1.2.3"
    #The web host's web server port. This should match what is in the *_main.parameter file.
    WEB_PORT="8081"
    #The first crop's context path (the web application). This should match what is in the Docker hub images.
    CROP1_CONTEXT_PATH="/GOBii-maize"
    #The second crop's context path (the web application). This should match what is in the Docker hub images.
    CROP2_CONTEXT_PATH="/GOBii-wheat"
    #The database host's domain name or IP address
    DB_HOST="172.1.2.4"
    #The database host's postgres port. This should match what is in the *_main.parameter file.
    DB_PORT="5433"
    #The main GOBii database user
    DB_USERNAME="dummyuser"
    #The main GOBii database user's password. If you want to change this password, you can do so once the whole automated deployment finishes successfully. After changing it in postgres, make sure you also change the corresponding tag in the GOBii-web.xml file.
    DB_PASS="dummypass"
    #The database name of the first crop. This should match what is in the Docker hub images.
    DB_NAME_CROP1="GOBii_maize"
    #The database name of the second crop. This should match what is in the Docker hub images.
    DB_NAME_CROP2="GOBii_wheat"
    #KDCompute related params used by GOBiiconfig jar
    #KDcompute host
    KDC_HOST="kdchost"
    #KDcompute port
    KDC_PORT="8083"
    #kdcompute application context on server
    KDC_CONTEXT_PATH="kdcompute"
    #KDcompute job start param
    KDC_JOB_START="qcStart"
    #KDcompute job status param
    KDC_JOB_STATUS="qcStatus"
    #KDcompute job download
    KDC_JOB_DOWNLOAD="qcDownload"
    #Seconds to wait between status checks
    KDC_JOB_CHECK_STATUS="60"
    #Minutes until job is hung
    KDC_JOB_FAIL_THRESHOLD="2880"
    #KDcompute purge the jobs
    KDC_PURGE="qcPurge"
    #KDcompute is active (false|true) for GOBii 
    KDC_ACTIVE="false"
    
    
    


    You can name this file however you want. Before version 1.3, the name had to be specified in the CONFIGURATOR_PARAM_FILE value of the main parameter file from step 2.

    As of version 1.3, the *install.parameter file is passed as a parameter to the main call to the GOBii_ship scripts. Hence, it no longer needs to be set in the CONFIGURATOR_PARAM_FILE of the *main.parameter file.

    If anything is unclear, or if you are not sure what to put in a parameter's value, please contact GOBii support (see the contact details at the top of this page).
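
Several values appear in both parameter files, and a mismatch between them is easy to miss. The sketch below cross-checks them; it is a hypothetical helper, assuming both files are KEY="value" lines that bash can source. The template comments only state that the ports should match the main file; treating the hosts and crop names the same way is an assumption based on the sample values shown above.

```shell
#!/usr/bin/env bash
# Hypothetical cross-check of values that should agree between the two parameter files.
# Arguments: <main parameter file> <install parameter file>
check_consistency() {
  local main_file="$1" install_file="$2"
  (
    # Source both files in a subshell so nothing leaks into the caller's shell.
    . "$main_file" || exit 2
    . "$install_file" || exit 2
    status=0
    # same <label> <value from main file> <value from install file>
    same() {
      if [ "$2" != "$3" ]; then
        echo "Mismatch for $1: main='$2' install='$3'" >&2
        status=1
      fi
    }
    same "web host" "${Docker_WEB_HOST:-}"   "${WEB_HOST:-}"
    same "web port" "${Docker_WEB_PORT:-}"   "${WEB_PORT:-}"
    same "db host"  "${Docker_DB_HOST:-}"    "${DB_HOST:-}"
    same "db port"  "${Docker_DB_PORT:-}"    "${DB_PORT:-}"
    same "crop 1"   "${Docker_CROP1_NAME:-}" "${CROP1:-}"
    same "crop 2"   "${Docker_CROP2_NAME:-}" "${CROP2:-}"
    exit "$status"
  )
}
```

Running `check_consistency params/my_main.parameters params/my_install.parameters` (file names are illustrative) before deployment prints one line per mismatch and returns a non-zero status if any were found.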


  4. Run the deployment scripts
    1. If you are deploying GOBii into just one machine, you run the_GOBii_ship.sh to pull, deploy, and configure all three Docker containers to one target server. To do so, run a command similar to:

      #Usage: bash the_GOBii_ship.sh <path-of-main-param-file> <path-of-install-param-file> <Dockerhubpassw | askme> <GOBii_release_version>
      #Set Dockerhubpassw parameter to 'askme' for the script to prompt for password instead.
      
      bash the_GOBii_ship.sh params/template_main_<CG Center File Name>.parameters params/template_install_<CG Center File Name>.parameters askme release-1.4-7

      No Sudo

      This script should not be run using sudo or as the root user. Some commands will automatically prompt you if they need elevated permissions.

      Troubleshooting

      If you made a mistake and want to start over, or if there are other Dockers in the server you want to get rid of, clean up by running docker stop, docker rm, and docker rmi.
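
The cleanup described in the Troubleshooting note could look roughly like the sketch below. The helper name and the DOCKER override are hypothetical; the container and image names you pass in must be the ones from your own parameter file.

```shell
#!/usr/bin/env bash
# Hypothetical cleanup helper. DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

# Stop and remove one container, then remove the image it was created from.
# Arguments: <container name> <image reference>
cleanup_node() {
  local container="$1" image="$2"
  "$DOCKER" stop "$container"
  "$DOCKER" rm "$container"
  "$DOCKER" rmi "$image"
}
```

For example, `DOCKER=echo cleanup_node GOBii_web_cimmyt gadm01/GOBii_web_cimmyt:release-1.4-7` prints the arguments of the three docker commands instead of running them. The image reference format (account/repository:tag) is an assumption based on the parameter names above; `docker images` shows the exact references present on the host.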

    2. If you are deploying GOBii to more than one machine, you run three scripts: the_GOBii_ship_db.sh, the_GOBii_ship_web.sh, and the_GOBii_ship_compute.sh – strictly in that order.

      Topology Differences

      Regardless of how the GOBii nodes are distributed, you still have to run these three scripts sequentially on their respective servers.

      For example, if you have two servers (servers A and B), and you want to put both the DB and COMPUTE nodes to server A then the WEB node to server B, then you run the_GOBii_ship_db.sh in server A first, then go to server B and run the_GOBii_ship_web.sh, finally go back to server A and run the_GOBii_ship_compute.sh.


      1. Run the modularized scripts the same way you would run the one-server script in step 4.1:

        #Usage: bash the_GOBii_ship_<node>.sh <path-of-params-file> <Dockerhubpassw | askme> <GOBii_release_version>
        #Set Dockerhubpassw parameter to 'askme' for the script to prompt for password instead.
        
        
        #go to DB server and run
        bash the_GOBii_ship_db.sh GOBii_main.parameters askme 1.0-Cascadilla
        #go to WEB server and run
        bash the_GOBii_ship_web.sh GOBii_main.parameters askme 1.0-Cascadilla
        #go to COMPUTE server and run
        bash the_GOBii_ship_compute.sh GOBii_main.parameters askme 1.0-Cascadilla


        Here you see the importance of having the scripts and the param files together in a directory that's accessible to all servers.
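
To make the required order explicit for a given topology, the steps can be printed before anything is run. This is a minimal sketch with a hypothetical function name; it only prints the commands from the usage line above and does not execute anything or log into any server.

```shell
#!/usr/bin/env bash
# Hypothetical planner: prints which script to run on which server, in the required order.
# Arguments: <db server> <web server> <compute server> <main param file> <release version>
deployment_plan() {
  local db="$1" web="$2" compute="$3" params="$4" version="$5"
  local step=1 node server
  # The order db, web, compute is the order required by the modular scripts.
  for node in db web compute; do
    server="${!node}"   # look up the server assigned to this node
    echo "${step}. on ${server}: bash the_GOBii_ship_${node}.sh ${params} askme ${version}"
    step=$((step + 1))
  done
}
```

For the two-server example above, `deployment_plan A B A GOBii_main.parameters 1.0-Cascadilla` lists the DB script on server A, then the web script on server B, then the compute script back on server A.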

  5. After running the deployment scripts and completing verification step #1 below, turn on LDAP authentication if it is not already on (details are in verification step #2). When turning on LDAP, make sure that the LDAP certificate is loaded into the JVM. You can do so as follows:
      #########
      #run the install cert
      #Usage: bash install_cert.sh </data/mycert.der> <ldap_host> </usr/local/lib/keytool> </usr/local/jdk/jre/lib/cacerts> <changeit>

      docker exec GOBii-web-node bash -c '
      cd /data/GOBii_bundle/config;
      bash install_cert.sh /data/cacert_mgs1.der cbsu_mgs1 /usr/lib/jvm/java-8-oracle/bin/keytool /usr/lib/jvm/java-8-oracle/jre/lib/security/cacerts changeit;
      '
      #########

      You should see a confirmation message saying "certificates added to keystore". Finally, restart Tomcat, making sure it runs via user gadm:

      docker exec -u gadm GOBii-web-node bash -c '
      sh /usr/local/tomcat/bin/shutdown.sh;
      sh /usr/local/tomcat/bin/startup.sh;
      '

      As you can see, /data/cacert_mgs1.der is the certificate file. The command above looks for it in the web Docker's home volume, which is /data. Wherever that volume points to on the host server, put the DER file there before running the command. Lastly, the paths to keytool and cacerts will most likely stay the same, as we only distribute Dockers based on Ubuntu; if we later offer another Linux flavor, or the JVM changes, those paths may change.
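
Because the certificate path is given from the container's point of view, it can help to translate it to the host path and confirm the file is there before running install_cert.sh. The sketch below assumes, as described in the parameter file comments, that /data inside the containers is the shared directory set in BUNDLE_PARENT_PATH; both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate a path under the containers' /data volume
# into the matching path in the host's shared directory.
# Arguments: <BUNDLE_PARENT_PATH on the host> <path as seen inside the container>
host_path_for() {
  local parent="${1%/}" container_path="$2"
  case "$container_path" in
    /data/*) printf '%s/%s\n' "$parent" "${container_path#/data/}" ;;
    *) echo "Not under /data: $container_path" >&2; return 1 ;;
  esac
}

# Succeeds if the certificate file exists where the web container will look for it.
cert_in_place() {
  local host_path
  host_path="$(host_path_for "$1" "$2")" || return 1
  [ -f "$host_path" ]
}
```

With the sample values on this page, `host_path_for /mnt/temp-GOBii /data/cacert_mgs1.der` prints `/mnt/temp-GOBii/cacert_mgs1.der`, which is where the DER file would need to be copied.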

  6. Make sure that the ports you assigned to the Dockers (typically 8081, 5433, and 2222) are open. Otherwise the containers won't be able to communicate with each other, and requests will fail with internal server error 500. A more specific error message is shown in Tomcat's log (catalina.out); future versions may report a more specific error directly. How you open a port differs from OS to OS.
    1. Just to give you an example, in CentOS 6 and 7:

      $ sudo iptables -I INPUT -p tcp -m tcp --dport <port_number> -j ACCEPT
      $ sudo service iptables save
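
Once the containers are running, a quick way to confirm that a port is reachable from another server is to try opening a TCP connection to it. The sketch below is an illustration only: the function names are hypothetical, and it relies on bash's /dev/tcp redirection and the coreutils timeout command being available.

```shell
#!/usr/bin/env bash
# Hypothetical reachability check using bash's built-in /dev/tcp redirection.

# Succeeds if a TCP connection to <host> <port> can be opened within 3 seconds.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Report the state of each port given after the host name.
check_ports() {
  local host="$1" port
  shift
  for port in "$@"; do
    if port_open "$host" "$port"; then
      echo "open: ${host}:${port}"
    else
      echo "closed or filtered: ${host}:${port}"
    fi
  done
}
```

With the sample topology, this would be run from one of the other servers as `check_ports 172.1.2.3 8081`, `check_ports 172.1.2.4 5433`, and `check_ports 172.1.2.5 2222`. A port also reports as closed when nothing is listening on it, so the check is only meaningful after the containers are up.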
  7. Make sure that the /data symlink in the web node was created for the LoaderUI to work properly. If not, do the following:
    1. If you run this in the same terminal session in which you ran the GOBii_ship*.sh scripts, run it as is; the $BUNDLE_PARENT_PATH variable should still be set. If not, replace the variable with that parameter's value from your *_main.parameters file.

      /data symlink
      sudo ln -sfn $BUNDLE_PARENT_PATH /data 

      If the /data symlink cannot be created because /data is already a directory or a drive mount point on the target host, manually create a symlink from /data/GOBii_bundle to $BUNDLE_PARENT_PATH/GOBii_bundle, so that /data/GOBii_bundle still points to the correct location.
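
The two cases for the /data link in step 7 can be sketched in one helper. This is a hypothetical function, not part of the deployment scripts; the optional second argument exists only so the logic can be tried against a scratch directory, and on a real host the ln commands would normally need sudo as in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper covering both cases for the /data link.
# Arguments: <BUNDLE_PARENT_PATH> [data root, defaults to /data]
link_data() {
  local parent="${1%/}" data_root="${2:-/data}"
  if [ -d "$data_root" ] && [ ! -L "$data_root" ]; then
    # The data root is a real directory or mount point: link only the bundle inside it.
    ln -sfn "$parent/GOBii_bundle" "$data_root/GOBii_bundle"
  else
    # The data root is absent or already a symlink: (re)point it at the shared directory.
    ln -sfn "$parent" "$data_root"
  fi
}
```

If a real GOBii_bundle directory already exists inside the data root, the first branch would create the link inside that directory rather than replace it, so that case needs to be resolved by hand first.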

Verification Steps

  1. Verify that the deployment was successful and that all Docker containers can communicate with each other. You can do so by a simple test:
    1. Go to the URL of the ExtractorUI for each of your crops. For example, http://yourwebnodeip.or.hostname:8081/GOBii-wheat/. You should see a login page:

      1. If you set "TEST" as the value to the authentication protocol in the earlier param file, you can use the test user here, USER_READER. If not, and your LDAP is connected, you use a valid user credential in your LDAP system.
    2. Click login. You should see the extractorUI page:
    3. Simply select any value for the Principal Investigator, Project, and Experiment (or just leave them at the defaults). Make sure the data sets checkbox is ticked. Also, make sure the principal investigator drop-down gets populated with names.
    4. Click submit.
    5. Verify that the "Status Message" box shows the following:
    6. If you have more than one crop, try navigating to the other crop by the crop drop-down:
    7. Congratulations! This verifies that your Dockers are all running and communicating with each other.

      Rationale

      Web Docker is working as the login to the web application goes through, even with the test user USER_READER.

      Database Docker is working as the drop-down controls in the extractor UI get populated with values.

      Compute Docker is working as the "Status Message" box didn't show any errors when you submitted a job.

      IP addresses/Host names are correct as you were able to navigate from one crop to another.

  2. Verify that your LDAP is connected properly to GOBii.

    1. To turn on LDAP authentication, do the following:
      1. Go to web Docker's host server
      2. Navigate to /data/GOBii_bundle/config/
      3. Open GOBii-web.xml for editing
      4. Change the value of GOBiiAuthenticationType to LDAP, i.e. <GOBiiAuthenticationType>LDAP</GOBiiAuthenticationType>
      5. Save the file
      6. Restart the web server via:

        Restarting the web server
        docker exec -ti -u gadm GOBii-web-node bash
        sh /usr/local/tomcat/bin/shutdown.sh
        sh /usr/local/tomcat/bin/startup.sh

        Make sure Tomcat always runs under the Docker user gadm. This ensures that files created via the web server will always get proper group permissions.
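
The manual edit of GOBii-web.xml described above can also be scripted. The following is a rough sketch under two assumptions: the GOBiiAuthenticationType element sits on a single line, and GNU sed is available. The function name is hypothetical, and Tomcat still has to be restarted afterwards as shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set the authentication type in a GOBii-web.xml file in place,
# keeping a backup copy next to it. Relies on GNU sed's -i option.
# Arguments: <path to GOBii-web.xml> <TEST|LDAP>
set_auth_type() {
  local file="$1" type="$2"
  cp -p "$file" "${file}.bak" || return 1
  sed -i "s#<GOBiiAuthenticationType>[^<]*</GOBiiAuthenticationType>#<GOBiiAuthenticationType>${type}</GOBiiAuthenticationType>#" "$file"
}
```

For example, `set_auth_type /data/GOBii_bundle/config/GOBii-web.xml LDAP` switches the value and leaves the previous file as GOBii-web.xml.bak, which makes it easy to revert if the LDAP login does not work.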

    2. Verify that LDAP is working by doing everything in verification step #1, only this time using a valid LDAP user instead of the user USER_READER.
    3. You should receive an email like the one below:

      This is expected to fail as you don't have any data to extract yet, but you have verified that everything's working, including the mailer.

  3. Verify that the LoaderUI is working by loading valid files from your projects.
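
Before walking through verification step #1 in a browser, a scripted check can confirm that the web node answers at all. This is only a sketch: the function names are hypothetical, it assumes curl is installed, and a successful HTTP response does not replace the login and job-submission checks above.

```shell
#!/usr/bin/env bash
# Hypothetical smoke test for the ExtractorUI login page of one crop.

# Build the ExtractorUI URL from the web host, web port, and crop context path.
extractor_url() {
  local host="$1" port="$2" context="${3#/}"
  context="${context%/}"   # tolerate a trailing slash in the context path
  printf 'http://%s:%s/%s/\n' "$host" "$port" "$context"
}

# Print the HTTP status code returned for the URL ("000" if the connection fails).
http_status() {
  curl -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

# Example usage with the sample values from the install parameter file:
#   http_status "$(extractor_url 172.1.2.3 8081 /GOBii-wheat)"
```

The host, port, and context path correspond to WEB_HOST, WEB_PORT, and CROP2_CONTEXT_PATH in the install parameter file.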


Cheers!

Cheers! You are done deploying GOBii and verifying that everything works!

Note that we are working on a configurator CLI that will allow you to reconfigure a running GOBii instance (and add an option of using it directly instead of the param files) as we want to avoid direct modifications of the GOBii-web.xml.