Introduction
We aim to enable the next-generation cloud-based platform for collaborative research on AWS with access to research tools, data sets, processing pipelines, and analytics workbench in a frictionless manner. It takes less than 30 minutes to launch a “MyResearchCloud” working environment for Principal Investigators and Researchers with security, scalability, and cost governance. Using the Software as a Service (SaaS) model is a preferable option for Scientific research in the cloud with tight control on data security, privacy, and regulatory compliances.
Typical Top 5 Use Cases
- Need an RStudio solution on AWS Cloud with an ability to connect securely (using SSL) without having to worry about managing custom certificates and their lifecycle
- Genomic pipeline processing using Nextflow and Nextflow Tower (open source) solution integrated with AWS Batch for easy deployment of open source pipelines and associated cost tracking per researcher and per pipeline
- Enable researchers with EC2 Linux and Windows servers to install their specific research tools and software. Ability to add AMI based researcher tools (both private and from AWS Marketplace) with 1-click on MyResearchCloud
- Using SageMaker AI/ML Workbench drive Data research (like COVID-19 impact analysis) with available public data sets already on AWS cloud and create study-specific data sets
- Enable a small group of Principal Investigator and researchers to manage Research Grant programs with tight budget control, self-service provisioning, and research data sharing
MyResearchCloud is a solution powered by RLCatalyst Research Gateway product and provides the basic environment with access to data, workspaces, analytics workbench, and cloud pipelines, as explained in the figure below.
Currently, it is not easy for research institutes, their IT staff, and a group of principal investigators & researchers to leverage the cloud easily for their scientific research. While there are constraints with on-premise data centers and these institutions have access to cloud accounts, converting a basic account to one with a secured network, secured access, ability to create & publish product/tools catalog, ingress & egress of data, sharing of analysis, tight budget control and other non-trivial tasks divert the attention away from ‘Science’ to ‘Servers’.
We aim to provide a standard catalog for researchers out-of-the-box solution with an ability to also bring your own catalog, as explained in the figure below.
Based on our discussions with research stakeholders, especially small & medium ones, it was clear that the users want something as easy to consume as other consumer-oriented activities like e-shopping, consumer banking, etc. This led to the simplified process of creating a “MyResearchCloud” with the following basic needs:
- This “MyResearchCloud” is more suitable for smaller research institutions with a single or a few groups of Principal Investigators (PI) driving research with few fellow researchers.
- The model to set up, configure, collaborate, and consume needs to be extremely simple and comes with pre-built templates, tools, and utilities.
- PI’s should have full control of their cloud accounts and cost spends with dynamic visibility and smart alerts.
- At any point, if the PI decides to stop using the solution, there should be no loss to productivity and preservation of existing compute & data.
- It should be easy to invite other users to collaborate while still controlling their access and security.
- Users should not be loaded with technical jargon while ordering simple products for day-to-day research using computation servers, data repositories, analysis IDE tools, and Data processing pipelines.
Based on the above ask, the following simple steps have been enabled:
The picture below shows the easy way to get started with the new Launchpad and 30 minutes countdown.
Architecture Details
To balance the needs of Speed with Compliance, we have designed a unique model to allow Researchers to “Bring your own License” while leveraging the benefits of SaaS in a unique hybrid approach. Our solution provides a “Gateway” model of hub-and-spoke design where we provide and operate the “Hub” while enabling researchers to connect their own AWS Research accounts as a “Spoke”.
Security is a critical part of the SaaS architecture with a hub-and-spoke model where the Research Gateway is hosted in our AWS account using best practices of Cloud Management & Governance controlled by AWS Control Tower while each tenant is created using AWS security best practices of minimum privileges access and role-based access so that no customer-specific keys or data are maintained in the Research Gateway. The architecture and SaaS product are validated as per AWS ISV Path program for Well-Architected principles and data security best practices.
The following diagram explains in more detail the hub-and-spoke design for the Research Gateway.
This de-coupled design makes it easy to use a Shared Gateway while connecting your own AWS Account for consumption with full control and transparency in billing & tracking. For many small and mid-sized research teams, this is the best balance between using a third-party provider-hosted account and having their own end-to-end setup. This structure is also useful for deploying a hosted solution covering multiple group entities (or conglomerates), typically covering a collaborative network of universities working under a central entity (usually funded by government grants) in large-scale genomics grants programs. For customers who have more specific security and regulatory needs, we do allow both the hub-and-spoke deployment accounts to be self-hosted. The flexible architecture can be suitable for different deployment models.
AWS Services that MyResearchCloud uses for each customer:
In our experience, research institutions can enable new groups to use MyResearchCloud with small monthly budgets (starting with US $100 a month) and scale their cloud resources with cost control and optimized spendings.
Summary
With an intent to make Scientific Research in the cloud very easy to access and consume like typical Business to Consumer (B2C) customer experiences, the new “MyResearchCloud” model from Relevance Lab enables this ease of use with the above solution providing flexibility, cost management, and secure collaborations to truly unlock the potential of the cloud. This provides a fully functional workbench for researchers to get started in 30 minutes from a “No-Cloud” to a “Full-Cloud” launch.