Version 2.1
You can find the release notes for every release of GOBii on this page: System Requirements and Release Notes
Versioning
When using this document, make sure that you are deploying the correct GOBii version. The official build string for GOBii version 2.1 (a parameter string you need to run the shell scripts) is: release-2.1

Operating System, bash & Docker Versions
The following are the versions used when developing and testing GOBii:
Operating Systems:
Bash Version:
Docker Version:
git Version:
GDM Deployment Versions
This shows the Docker versions used for deployment of this release:
For any questions or clarifications, please contact Kevin Palis or Roy Petrie.
Introduction
This section covers the definition of terms, the background, and a brief overview of GOBii.

Definition of Terms
Background
GOBii is made up of multiple modules, categorized by function. A system diagram that shows these categories (by Docker container), the data flow, and the modules is available here. Depending on your server topology, the instructions on this page may require some tweaking. Each section whose steps differ significantly depending on server topology includes a "Note Box" like the one below.
To give you an idea, here's an example topology and node distribution:
Server 1:
Server Head: GOBII Test (all nodes)
Server Node1: GOBII Prod Database Node
Server Node2: GOBII Prod Web Node
Server Node3: GOBII Prod Compute Node
Initial Installation Prerequisites
1. The official repository for the deployment scripts is here. Make sure you clone or download the scripts from there. The branch you should get is release/<version> (e.g., release/2.1). You can also use the master branch if you are deploying the latest, but because our clients can run varying versions on different servers, all release branches are kept.
2. Finalize your topology and write it down. If you are deploying all three GOBii nodes to a single server, you run a different script than when you deploy one node per server or any other variation (in which case you run three scripts).
3. The servers should have Docker engine version 17 or later installed, and must have access to Docker Hub:
-Ubuntu: https://docs.docker.com/engine/installation/linux/docker-ce/ubuntu/#upgrade-docker-ce-1
-CentOS: https://docs.docker.com/install/linux/docker-ce/centos/
4. A mount point or a shared drive that all the nodes can access; this will be a volume mounted to all three Docker containers.
5. The user that will run the scripts needs to be a sudoer and a member of the 'gobii' and 'docker' groups, so preferably the user 'gadm'. The username is arbitrary; it just needs to be consistent. You may find 'gadm sudoer' used in the rest of this document; just note that the name is flexible. sudo usermod -aG docker gadm
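The branch layout from prerequisite #1 can be sketched as follows. The key step is checking out the release branch matching the version you deploy (release/<version>); in real use you would clone the official gobiideployment repository linked above, while the block below builds a local stand-in repository so the sketch is safe to run anywhere:

```shell
# Real use (repository URL linked above):
#   git clone <repo-url> gobiideployment && cd gobiideployment
#   git checkout release/2.1
#
# Runnable stand-in so the checkout pattern can be demonstrated locally:
WORK="$(mktemp -d)" && cd "$WORK"
git init -q gobiideployment && cd gobiideployment
git -c user.name=deploy -c user.email=deploy@example.org \
    commit -q --allow-empty -m "stand-in commit"
git branch release/2.1          # release branches are kept per version
git checkout -q release/2.1     # deploy from the matching release branch
git rev-parse --abbrev-ref HEAD
```

The same pattern applies to any other release (release/1.5, release/2.0, and so on), or master if you are deploying the latest.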
For a test GOBii instance, you can use the vanilla versions of the dockers:
Non-Destructive Deployment [NDD]
This section describes the non-destructive deployment architecture. This architecture needs to be implemented prior to any deployment after version 2.0.

NDD Diagram
This diagram represents the current architecture of the directories and the symlinks. In this architecture, the /data/gobii_bundle directory is destroyed during a new deployment and replaced with the latest version. Because the persistent data is accessed through symlinks, those directories are preserved in the /storage/persistent_data directory.
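The symlink layout described above can be sketched as follows. The subdirectory name "crops" is an assumption for illustration, and a scratch prefix stands in for the real /data and /storage roots so the sketch is safe to run anywhere:

```shell
# Stand-in roots (in production these are /data and /storage):
PREFIX="$(mktemp -d)"
BUNDLE="$PREFIX/data/gobii_bundle"           # destroyed on each deployment
STORAGE="$PREFIX/storage/persistent_data"    # survives deployments

mkdir -p "$BUNDLE" "$STORAGE/crops"

# The bundle holds only a symlink; the data itself lives in persistent_data,
# so replacing the bundle during deployment does not touch the data:
ln -sfn "$STORAGE/crops" "$BUNDLE/crops"

readlink "$BUNDLE/crops"   # resolves into persistent_data
```

After a redeployment recreates the bundle, recreating the symlinks restores access to the untouched persistent data.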
Backups
This section is for existing instances that already have data in storage. It shows the process used to back up the existing data.
As of release 2.1, the system supports non-destructive deployments using symlinks to access data. It is still very important to perform data backups, but the restore process described later in this document has been deprecated.
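A backup of the bundle before deployment can be sketched as below. The archive name and the stand-in paths are assumptions (in production the bundle is /data/gobii_bundle); adjust to your topology:

```shell
# Stand-in bundle so the sketch is runnable anywhere:
PARENT="$(mktemp -d)"                      # stand-in for /data
BUNDLE="$PARENT/gobii_bundle"
mkdir -p "$BUNDLE/config"
echo "<gobii/>" > "$BUNDLE/config/gobii-web.xml"

# Archive the whole bundle, dated, next to it:
BACKUP="$PARENT/gobii_bundle_backup_$(date +%Y%m%d).tar.gz"
tar -czf "$BACKUP" -C "$PARENT" gobii_bundle

tar -tzf "$BACKUP"   # lists the archived bundle contents
```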
Deployment
This section details the scripts, parameters, and process used to deploy GOBii.

Deployment Scripts and Parameters
Copy the deployment scripts and files from the cloned repository (prerequisite #1) to the shared drive (prerequisite #4). We update the param files from time to time (i.e., as new features are added), so please don't just copy-paste the sample param files below; they are shown on this page for reference only. Instead, pull from our deployment scripts git repository for the particular release you are deploying (e.g., release/1.5). The templates shown below were last updated for version 2.0.

Edit the main parameter file. You can find a template in the repository (gobiideployment/params/template_main.parameter). It contains all the topology information and deployment credentials. The template is shown below, with each parameter explained above the corresponding line.

All the passwords and some usernames have been omitted from the parameter file templates on this page for security. Make sure you check Default Credentials [CONFIDENTIAL] to replace the parameters with the correct values. If you can't access the page with the default credentials, contact Kevin Palis or Roy Petrie.

Version 2.0+: dockerhub access
As of version 2.0, the container repos still exist under the user gadm01, but you can no longer upload to them. This was done for security and maintenance purposes. Please use gadmreader to pull images from the gadm01 account.

template_main.parameters
This template has been updated with the most recent parameters as of version 2.0. For ease of use, the template has been expanded with white space between parameters for a more readable and more easily editable structure. Additionally, the parameters most recently added appear at the bottom of the file, so that for any deployment beyond the last version they can easily be copied and pasted into existing parameter files.
As of version 2.0, any password set to "askme" in the *main.parameters file will be requested from the user during script deployment. The prompt input is hidden to keep clear-text passwords to a minimum. If a password is set, the script continues without prompting.

You can name this file however you want; the full file path is passed to the deployment script. If anything is unclear or you're not sure what to put in a parameter's value, please ask Kevin Palis. If a seed context for your crop is not available and you would like one (e.g., seed_crop3), please contact Roy Petrie or Kevin Palis.

Edit the 'install' parameter file. You can find a template in the repository (gobiideployment/params/template_install.parameter). It contains all of the GOBii instance's configuration (i.e., the runtime configuration via the gobii-web.xml content). The template is shown below, with each parameter explained above the corresponding line.

template_install.parameters
For ease of use, the template has been expanded with white space between parameters for a more readable and more easily editable structure. As of version 1.3, the *install.parameter file is also passed as a parameter to the main call to the gobii_ship scripts; hence, it does not need to be set in the CONFIGURATOR_PARAM_FILE of the *main.parameter file. If anything is unclear or you're not sure what to put in a parameter's value, please ask Kevin Palis.

Running the Deployment Script
This script should not be run using sudo or as the root user; some commands will automatically prompt you if they need elevated permissions. If you are deploying GOBii onto just one machine, you run 'the_gobii_ship.sh' to pull, deploy, and configure all three Docker containers on one target server.
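A minimal sketch of the single-server invocation is below. The argument order and the parameter file paths are assumptions; check the script's usage in the deployment scripts repository. The command is printed rather than executed here so the sketch is safe to copy and review first:

```shell
# The build string for this release (passed to the shell scripts):
BUILD_STRING="release-2.1"

# Full paths to your parameter files (example paths; use your own):
MAIN_PARAMS="/shared/deploy/gobii_main.parameters"
INSTALL_PARAMS="/shared/deploy/gobii_install.parameters"

# Argument order is an assumption; verify against the script's usage.
echo ./the_gobii_ship.sh "$BUILD_STRING" "$MAIN_PARAMS" "$INSTALL_PARAMS"
```

Remove the leading echo to actually run the deployment (not as root, and not via sudo, as noted above).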
To do so, you run a single command; see the deployment scripts repository for the exact syntax. If you made a mistake and want to start over, or if there are other dockers on the server that you want to get rid of, do a cleanup by running docker stop, rm, and rmi.

The GOBii Ship...
As of version 2.0, only one script needs to be run. This deployment script will be updated if new containers need to be deployed alongside GDM, but it can now perform a full deployment, calling each script in the proper order, or deploy each container individually. This was done so that the scripts do not need to be maintained in two places when pre-existing containers are configured or updated. On deployment, vim is now installed on the web, db, compute, and kdc nodes. This will eventually be built into the dockerhub containers in the repo, but for the time being they get vim the traditional way.

LDAP Install Cert
After running the deployment scripts and doing verification step #1 below, turn on LDAP authentication if it wasn't already (details in verification step #2). When turning on LDAP, make sure that the LDAP certificate is loaded into the JVM. You can do so by:
You should see a confirmation message saying "certificates added to keystore". Finally, restart Tomcat, making sure it runs as user gadm.

Here, /data/cacart_mgs1.der is the certificate file. The command above looks for it in the web docker's home volume, which is /data; so wherever that volume points to on the host server, make sure you put the DER file there before running the command. Lastly, the paths to keytool and cacerts will most likely stay the same, as we only distribute dockers based on Ubuntu; but if in the future we offer another Linux flavor, or the JVM changes, those paths may change.

Make sure that the ports you assigned to the dockers (typically 8081, 8083, 8084, 5433, and 2222) are open.
Otherwise, the containers won't be able to communicate with each other and will fail with an internal server error 500 (although in the future we may provide a more specific error message). The more specific error message is shown in Tomcat's log (catalina.out). Opening a port differs from OS to OS. Example: CentOS 6 and 7

Make sure that the /data symlink on the web node was created, so the LoaderUI works properly. If not, do the following: if run in the same terminal session in which you ran the gobii_ship*.sh scripts, run this as is and the $BUNDLE_PARENT_PATH variable will be set; if not, replace it with that parameter's value from your *_main.parameters file. If it is not possible to create the /data symlink because, on the target host, /data is already a directory or a drive mount point, manually create a symlink from /data/gobii_bundle to $BUNDLE_PARENT_PATH/gobii_bundle, effectively still making /data/gobii_bundle point to the correct location.
This is a script called at the start of the deployment to verify that the system is about to be WIPED of data: both the files associated with the DB and the database volumes are removed. It also makes you confirm multiple times!
Additional Scripts
This script has been built into the_gobii_ship.sh, and it is recommended to have it running, but since its creation the .jar used within the script has been moved from its original location. The jar can be used and processed manually until it has been placed back into the gobiideployment repo. The kdc_passwd.sh script was built to help update the KDC admin password. The Non-Destructive Deployment architecture must be in place in order to run this script.
This script checks the directory locations and file existence. Depending on what exists and the current link status, the script makes gobii-web.xml backups for later use, if needed, and makes sure not to remove any files or directories in persistent_data, to prevent data loss.
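The guard logic described above can be sketched as follows. The stand-in paths replace the real /data and /storage roots, and the backup file name is an assumption; the point is backing up gobii-web.xml before the bundle is replaced, while leaving persistent_data untouched:

```shell
# Stand-in layout so the sketch is runnable anywhere:
PREFIX="$(mktemp -d)"
CONFIG="$PREFIX/data/gobii_bundle/config"
PERSIST="$PREFIX/storage/persistent_data"
mkdir -p "$CONFIG" "$PERSIST"
echo "<gobii/>" > "$CONFIG/gobii-web.xml"

# Back up gobii-web.xml into persistent storage before the bundle
# directory is destroyed by the deployment:
if [ -f "$CONFIG/gobii-web.xml" ]; then
    cp "$CONFIG/gobii-web.xml" "$PERSIST/gobii-web.xml.bak"
fi

# Nothing under persistent_data is ever removed by the deployment scripts.
ls "$PERSIST"
```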
Livelinks
Livelinks are links sent within the notification emails for loads and extracts; they point to the file's location within the ownCloud file browser. The following line, which enables livelinks, needs to be run from the /data/gobii_bundle/config directory.
[Deprecated] Restoring backup data
Due to the implementation of Non-Destructive Deployment, the restore process is no longer needed. This section details the data restoration process used after a backup and deployment have completed. Simply run the restore script with the correct parameters: GOBii Add-on Scripts#onScripts-RestoreDataBundlefromBackup. Go to the link above, as the syntax for running these scripts has changed slightly since version 1.4. Verify that the data was restored by opening any crop's ExtractorUI; you should see previously loaded datasets.
[Deprecated] Configure Timescope
This section has been deprecated, as the process has been built into the deployment scripts. It remains in this version's deployment documentation for future reference. As of version 1.5, we added a new web application called "Timescope", which allows users to browse and permanently delete data from the database. This required additional steps, but they only needed to be done once (i.e., if you upgrade to any version >1.5 in the future you won't need to do them again).

[Deprecated] Creating Timescope User
This process should not be needed, as the 'timescoper' user is already built into the deployed DB. This section is being kept for future reference.
Layered System Architecture
Created Jun 26, 2019
This architecture stack is for batch operations. Metadata and genotype data can easily get too large for conventional data loading to handle. The main differences between this stack and the "general" architecture are the data access layer and the business layer. The digester serves as the business layer: it converts whatever input files it receives (raw files like hmp, csv, etc., plus instruction files from the presentation layer) into a format that the data access layer understands for loading (IFL). It is also responsible for determining what information to extract and passing those instructions out to the metadata extractor (MDE). The data access layer here is broken into two parts based on functionality: IFL is for batch loading data into the different data stores, while MDE is for extracting data in batches and writing it to files. You can also think of IFLs and MDEs as including the functions provided to load and extract the genotype matrix from HDF5/MonetDB. The whole communication line between the digesters and the data access layer is facilitated by cron jobs (as indicated by the gear icons below).
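To illustrate the cron-driven handoff described above, the entries might look like the following crontab fragment. The script names, paths, and schedules here are hypothetical, for illustration only; the real entries depend on the bundle's layout:

```
# Hypothetical crontab entries driving the digester-to-IFL/MDE handoff:
*/5 * * * *  /data/gobii_bundle/cron/run_ifl.sh  >> /data/gobii_bundle/logs/ifl.log  2>&1
*/5 * * * *  /data/gobii_bundle/cron/run_mde.sh  >> /data/gobii_bundle/logs/mde.log 2>&1
```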
Timescope Verification
To verify that Timescope is properly deployed, open your browser and navigate to <web_node_url>:<web_node_port>/timescope. Upon initial install, there will only be one superuser account in your Timescope database; the credentials are on this page: Default Credentials. When you first log in, please change this password using the Timescope UI for security. If you cannot access it, contact either Kevin Palis or Roy Petrie. A few things to note regarding Timescope:
GOBii Portal
This section covers the portal that links all products and features with GDM. Post initial deployment, it is recommended to back up the current launchers.xml file used within the web node and restore it after deployment.

Multiple crops and additional links in the GOBii portal
The *_main.parameters file will need new lines indicating the names of the new crops. These parameters can be anywhere within the *_main.parameters file.
Location: xml_config_parser.py
Location: launchers.xml
Deploying more than one crop, or adding links to the portal, uses /data/gobii_bundle/config/utils/xml_config_parser.py, which changes and updates /usr/local/tomcat/webapps/gobii-portal/config/launchers.xml. During deployment, the script configures the original crop (crop 1 in the parameters file), but additional crops and links need to be added either by extending the script or by manually running the following commands. The example below is the default in the GOBii web script template for adding Portainer to the deployment. If the configurations need to be changed and the scripts are erroring, you can apply the configurations above manually; the webpage updates dynamically.
ownCloud
This section shows the setup and configuration required post deployment. It assumes the container was deployed but LDAP, storage, and shares have not yet been configured. After the ownCloud deployment, log in with the ownCloud default user and password. These credentials must be updated by the deploying system administrator, as the user and password are stored salted in the DB. Once logged in, select user name "Admin" > "Settings" > on the left panel, under Admin, select "User Authentication". The configuration on the "Server" tab shows the configuration made in the *_main.parameters file. If the configuration was correct at deployment, the bottom of the tab shows "OK". If it shows an error instead, update the configuration within this tab until your authentication configuration shows OK.

LDAP Certificates
If using a certificate, the configuration will show "OK" once it is properly set up, but will fail to return any users or groups. Within the "Login Attributes" tab, a username can be verified even without the certificate, but this is the extent of it until the certificate is added to the container. On deployment, the /data directory is mounted to the ownCloud container. Place the certificate anywhere within /data, then copy it to the /var/www/owncloud directory. The system should pick it up on the next attempt to authenticate.

ownCloud Active Directory Configuration
ownCloud works well with LDAP but needs additional settings configured for systems using Active Directory. Within the "Expert" tab, the settings for Internal Username and UUID may need to be updated.
Verify under the gear icon that "Enable Preview" and "Enable Sharing" are checked.

Sharing External Storage with Users
This will allow the GOBii group to see and use the shared files and directories, but they will be unable to edit or change them.
Portainer
Portainer is a container that sits on a system and monitors all docker/container information. It can monitor multiple endpoints by deploying the sherpa container and opening a specific port; this allows the portainer container to access and monitor all containers on a remote system.

Portainer Initial Login
On the initial login, Portainer will ask the admin to set up a password. Portainer holds its configuration under the /data directory; if the system is removed and redeployed, the same configuration remains, as the Portainer files within /data are not removed. Select "Local" > "Connect". This enables local monitoring and allows remote endpoints to be added for monitoring post deployment.

Adding a Sherpa Agent Node
Select "Endpoints" in the left panel > "Add endpoint", then add the configuration for the sherpa node under "Environment Details":
During testing of Portainer, the latest version had problems adding endpoints and would fail with a very nondescriptive error. This error only occurred when attempting to connect Ubuntu 16.04 server VMs using the latest portainer and the latest sherpa. This error was not seen between
Deploying the Sherpa Agent Container
Sherpa opens the container port for external access but is limited to the networks specified in the parameters. The portainer container will be unable to monitor the remote host unless communication on the specified port is allowed.
Deploying Sherpa via GOBii scripts
Deploying Sherpa manually
Post Deployment Verification [Smoke Testing]
This section is large enough that it warrants its own document. Please follow the link below for this version's Smoke Testing documentation.