Jumpstart Virtual Cloud Labs – Data Analytics Cloud Lab Setup in Minutes with Self-Service Access, Security, and Cost Controls

March 17, 2022

Introduction

The Research Gateway SaaS solution from Relevance Lab provides a next-generation cloud-based platform for collaborative scientific research on AWS, with frictionless access to research tools, data sets, processing pipelines, and analytics workbenches. It takes less than 30 minutes to launch a “MyResearchCloud” working environment for Principal Investigators and Researchers with security, scalability, and cost governance. The Software as a Service (SaaS) model is a preferable option for consuming functionality, but in the area of scientific research it is equally critical to have tight control over data security, privacy, and regulatory compliance.

One of the growing needs from customers is to use the solution for online training and for specialized use cases such as Bioinformatics courses. With the pandemic, there is tremendous new interest among students in pursuing life sciences courses and specializing in Bioinformatics streams. At the same time, education institutions are struggling to move their internal Training Labs infrastructure from data centers to the cloud. As an AWS specialized partner for Higher Education, we are working with a number of universities to understand their needs better and to provide solutions that address them in an easy and cost-effective manner.

The top 5 use cases shared by customers for setting up Virtual Cloud Labs for courses like Bioinformatics:

  • Enterprise Needs: The ability to move easily from physical data center labs to cloud-based Virtual Labs using the university's corporate cloud accounts, without compromising on security, and with tight cost controls and a self-service portal for Instructors and Students. This includes enterprise-grade controls on budget, Student/Instructor access, data security, and an approved products catalog.
  • Business Needs: The setup of a new Virtual Training Lab should support the key learning and research needs of the students:
      • Programs that give students access based on the academic calendar, for the duration of a full semester.
      • Longer-term projects and programs with lab access based on research grants and the associated budget and time constraints.
  • IT Department Needs: University Corporate IT needs to be able to allow specific departments (like Bioinformatics) to have their own Programs and Projects with self-service, without compromising on enterprise security and compliance needs.
  • Curriculum Department Needs: Department Heads (for example, in Bioinformatics) and Instructors need to be able to define the learning curriculum and associated training programs, with access to Classroom and Research Labs. Departments also need tight control over budgets and student access management.
  • Student Needs: The ability for students to access cloud-based Training Labs in an easy and simple manner without requiring deep cloud knowledge. Students also need pre-built solutions for basic needs, covering analytics tools like RStudio/Jupyter, access to secure data repositories, access to open-source tools and containers, and a collaboration portal.

The following picture describes the basic organization and roles set up in a university.

To balance the need for speed with compliance, we have designed a model that allows universities to “Bring your own License” while leveraging the benefits of SaaS in a hybrid approach. Our solution provides a “Gateway Model” with a Hub-and-Spoke design: we provide and operate the “Hub”, while universities and their departments connect their own AWS Research accounts as “Spokes” and get started within 30 minutes with full access to a complete Classroom Toolkit. A sample of the out-of-the-box Bioinformatics Lab tools available as a standard catalog is shown below.

Professors can add more tools to the standard catalog by importing their own AMIs using the AWS Service Catalog. It is also very simple to create new course material and support additional tools using the base building blocks provided out-of-the-box.

Currently, it is not easy for universities, their IT staff, professors, students, and research groups to leverage the cloud for their scientific research. On-premise data centers have constraints, and these institutions do have access to cloud accounts. However, converting a basic account into a working lab environment with a secure network, secure access, the ability to create and publish a product/tools catalog, ingress and egress of data, sharing of analysis, and tight budget control is a non-trivial task that diverts attention away from education toward infrastructure.

Based on our discussions with stakeholders, it was clear that users want something as easy to consume as other consumer-oriented activities like e-shopping or consumer banking. This led to a simplified process for creating a “My-Bioinformatics-Cloud-Lab” with the following basic steps:

1. A university can sign up with Research Gateway (SaaS) so that its different departments can use the software for their online training and research needs. For such university-level adoption, an enterprise version of the software is recommended (hosted by us or by the university itself), shared by the different departments (called Organizations or Business Units).
2. A simpler alternative is for a particular department to use our hosted version of Research Gateway and create a tenant there, with no overhead of maintaining a university-specific deployment.
3. A Head of Department (HOD) can sign up to create a new Tenant on Research Gateway and configure their own AWS Billing account to create Projects. Other professors can then be invited to each Project to be part of the online Training Labs. Projects can be aligned with semester-based classroom lab needs or can be part of ongoing research projects. Each Project has an assigned budget along with the associated professors and students who have access to it. The figure below shows typical department projects inside the portal.

4. Once the professor selects a Project, they can see the standard “available products” in that Project. The Project serves as the basic setup for a Training Lab. The figure below shows a sample screen with the set of tools professors can access by default. They can also add new products to the Lab Catalog.

For every Project (Lab), shared infrastructure is made available by default in the form of Project Storage, where curriculum-related data and information can be stored and made available to all students. The necessary security aspects, such as SSL connections, VPC, and IAM roles, are also set up by default to make sure the Cloud Training Lab has a well-architected design.

5. A professor can control basic parameters for the Lab, such as adding or deleting users and managing budgets, and can also take actions like “Pausing” a Project (no new products can be created, while existing ones can still be used) or “Stopping” a Project (all running machines are force-stopped and no new ones can be created, but data on the storage remains accessible to students). The figure below shows how to manage project-level users and budget controls.

6. A professor can track the consumption of the lab resources by all users including professors and students as shown in the figure below.

7. Once a student logs into the project and accesses the lab resources, they can create their own workspaces, such as RStudio, and interact with them from within the Portal. Once they are done with their work, they can stop the machine and log out to ensure no costs are incurred while the systems are not being used. When a researcher or student logs in, they can interact with active products and project storage as shown in the figure below.

8. The students can interact with their tools like RStudio from within the portal and connect to the same in a secure manner with a single click as shown in the figure below.

9. Clicking the “Open Link” action opens a familiar RStudio environment where students can log in and learn as per their curriculum needs. The figure below shows the standard RStudio environment.
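
The “Pausing” and “Stopping” actions from step 5 can be illustrated with a small sketch. This is not Research Gateway code or its API: the class and method names (Project, Product, launch, pause, stop, storage_readable) are hypothetical and only model the behavior described above, under the assumption that project storage stays readable in every state.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ProjectState(Enum):
    ACTIVE = "active"
    PAUSED = "paused"    # no new products; existing ones remain usable
    STOPPED = "stopped"  # running machines force-stopped; no new products


@dataclass
class Product:
    """A provisioned lab tool (for example an RStudio workspace). Hypothetical model."""
    name: str
    owner: str
    running: bool = True


@dataclass
class Project:
    """A Training Lab project. Hypothetical model of the behavior in step 5."""
    name: str
    state: ProjectState = ProjectState.ACTIVE
    products: List[Product] = field(default_factory=list)

    def launch(self, product_name: str, owner: str) -> Product:
        # New products can only be created while the project is active.
        if self.state is not ProjectState.ACTIVE:
            raise PermissionError(
                f"project is {self.state.value}; no new products can be created"
            )
        product = Product(product_name, owner)
        self.products.append(product)
        return product

    def pause(self) -> None:
        # Pausing blocks new launches but leaves existing products untouched.
        self.state = ProjectState.PAUSED

    def stop(self) -> None:
        # Stopping force-stops every running machine and blocks new launches.
        for product in self.products:
            product.running = False
        self.state = ProjectState.STOPPED

    def storage_readable(self) -> bool:
        # Assumption: data on project storage stays accessible in every state.
        return True
```

In this sketch, a workspace launched before pause() keeps running, a launch attempted after pause() or stop() raises PermissionError, and stop() marks every product as not running while storage_readable() still returns True.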

Summary

The new solution from Relevance Lab makes scientific research and training in the cloud very easy for use cases like Bioinformatics. It provides flexibility, cost management, and secure collaboration to truly unlock the potential of the cloud. For universities, this provides a fully functional Training Lab accessible by professors and students in less than 30 minutes.
