
Modern scientific research depends heavily on processing massive amounts of data, which requires elastic, scalable, easy-to-use, and cost-effective computing resources. AWS Cloud provides such resources, but researchers still find it hard to navigate the AWS console. RLCatalyst Research Gateway simplifies access to HPC clusters through a self-service portal that takes care of all the nuts and bolts to provision an elastic cluster based on AWS ParallelCluster 3.0 within minutes. Researchers can leverage this for their scientific computing.

Relevance Lab has been collaborating with AWS partner teams over the last year to simplify access to High Performance Computing across different fields like Genomics Analysis, Computational Fluid Dynamics, Molecular Biology, and Earth Sciences.

There is a growing need among customers to adopt High Performance Computing capabilities in the public cloud. However, this brings key challenges related to the right architecture, workload migration, and cost management. Working closely with AWS HPC groups, we have been enabling adoption of AWS HPC solutions with early adopters in Genomics and Fluid Dynamics among Higher Education and Healthcare customers. The primary ask is for a self-service portal for planning, deploying, and managing HPC workloads with security, cost management, and automation. The figure below shows the key building blocks of the HPC architecture that is part of our solution.


AWS ParallelCluster 3.0
AWS ParallelCluster is an open-source cluster management tool written in Python and available via the Python Package Index (PyPI). Version 3.0 also provides API support, which Research Gateway leverages to integrate with the AWS Cloud to set up and use HPC clusters for complex computational tasks. AWS ParallelCluster supports two different orchestrators, AWS Batch and Slurm, which cover the vast majority of requirements in the field. ParallelCluster brings many benefits, including easy scalability, manageability of clusters, and seamless migration to the cloud of on-premise HPC workloads.
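To illustrate what Research Gateway automates behind the scenes, here is a minimal sketch of defining and creating a small Slurm cluster with the ParallelCluster 3.0 CLI from Python; the subnet IDs, key name, and instance types are placeholder assumptions, not values used by Research Gateway itself.

    import subprocess

    # Minimal ParallelCluster 3.0 config; subnet IDs, key name, and
    # instance types below are illustrative placeholders.
    CONFIG = """\
    Region: us-east-1
    Image:
      Os: alinux2
    HeadNode:
      InstanceType: c5.xlarge
      Networking:
        SubnetId: subnet-0123456789abcdef0
      Ssh:
        KeyName: my-keypair
    Scheduling:
      Scheduler: slurm
      SlurmQueues:
        - Name: compute
          ComputeResources:
            - Name: c5large
              InstanceType: c5.large
              MinCount: 0
              MaxCount: 10
          Networking:
            SubnetIds:
              - subnet-0fedcba9876543210
    """

    with open("cluster-config.yaml", "w") as f:
        f.write(CONFIG)

    # The pcluster CLI is installed from PyPI: pip install aws-parallelcluster
    subprocess.run(
        ["pcluster", "create-cluster",
         "--cluster-name", "demo-hpc",
         "--cluster-configuration", "cluster-config.yaml"],
        check=True,
    )

With MinCount set to 0, compute nodes scale down to zero when the queue is idle, which is what makes the cluster elastic and cost-effective.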

FSx for Lustre
Amazon FSx for Lustre provides fully managed shared storage with the scalability and performance of the popular Lustre file system. This storage can be accessed with very low (sub-millisecond) latencies by the worker nodes in the HPC cluster and provides very high throughput.
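For reference, a shared Lustre file system of the kind described here can be created with a few lines of boto3; the subnet ID and capacity below are illustrative assumptions.

    import boto3

    fsx = boto3.client("fsx")

    # 1.2 TiB scratch file system in a placeholder private subnet
    response = fsx.create_file_system(
        FileSystemType="LUSTRE",
        StorageCapacity=1200,  # GiB; smallest SCRATCH_2 size
        SubnetIds=["subnet-0fedcba9876543210"],
        LustreConfiguration={"DeploymentType": "SCRATCH_2"},
    )
    print(response["FileSystem"]["FileSystemId"])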

NICE DCV
NICE DCV is a high performance remote display protocol used to deliver remote desktops and application streaming from resources in the cloud to any device. Users can leverage this for their visualization requirements.

Research Gateway Provides a Self-Service Portal for AWS PCluster 3.0 Launch with Automatic Cost Tracking
Using RLCatalyst Research Gateway, research teams are organized into projects, each with its own catalog of self-service workspaces that researchers can provision easily with minimal knowledge of AWS cloud setup. The standard catalog included with RLCatalyst Research Gateway now has a new item called PCluster, which a Principal Investigator can add to the project catalog to make it available to their team. This product is based on AWS ParallelCluster 3.0, a command-line tool aimed at advanced users; Research Gateway wraps this tool with an intuitive user interface.

To see how you can set up an HPC cluster within minutes, check this video.

The figure below shows a standard catalog inside Research Gateway for users to provision PCluster and FSx for Lustre with ease.


Setting Up a Shared Cluster for Use in the Project
The PCluster product on Research Gateway offers a lot of flexibility. While researchers can set up and use their own clusters, sometimes there is a need to use a shared cluster across collaborators within the same project. Towards this goal, we have also brought in a feature that allows a user to “share” the cluster with the entire project team. The other users can then connect to the same cluster and submit jobs. For example, a Principal Investigator might set up the cluster and share it with the researchers in the project to use for their computations.


Large Datasets Storage and Access to Open Datasets
AWS cloud is leveraged to deal with the needs of large datasets for storage, processing, and analytics using the following key products.

Amazon S3 for high-throughput data ingestion, cost-effective storage options, secure access, and efficient searching.

AWS DataSync, a secure online service that automates and accelerates moving data between on-premises and AWS storage services.

The AWS Open Data program, which houses openly available datasets across 200+ open data repositories.
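As a quick illustration of the Open Data program, many registry datasets live in public S3 buckets that can be read without AWS credentials; the 1000 Genomes bucket below is one well-known example.

    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    # Anonymous (unsigned) client for publicly readable Open Data buckets
    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

    resp = s3.list_objects_v2(Bucket="1000genomes", MaxKeys=5)
    for obj in resp.get("Contents", []):
        print(obj["Key"], obj["Size"])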

Cost Analysis of Jobs
Research Gateway injects cost allocation tags into the ParallelCluster so that all resources created are tagged and the cost of the scalable cluster can easily be monitored from the Research Gateway UI.
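Once cost allocation tags are in place (and activated in the billing console), the spend for a single cluster can also be queried directly through the Cost Explorer API. This is a minimal sketch; the tag key and value are hypothetical, since the exact tags Research Gateway injects are internal to the product.

    import boto3

    ce = boto3.client("ce")  # Cost Explorer

    resp = ce.get_cost_and_usage(
        TimePeriod={"Start": "2022-06-01", "End": "2022-07-01"},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        # Hypothetical cost allocation tag; real tag keys are product-specific
        Filter={"Tags": {"Key": "ClusterName", "Values": ["demo-hpc"]}},
    )
    for period in resp["ResultsByTime"]:
        print(period["TimePeriod"], period["Total"]["UnblendedCost"]["Amount"])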


Summary
AWS Cloud provides services like AWS ParallelCluster and FSx for Lustre that can help users with High Performance Computing for their scientific computing needs. Research Gateway makes it easy to provision these services with a 1-click, self-service model and provides cost tracking and governance controls to help manage your budget.

To know more about how you can address your HPC needs in the AWS cloud in 30 minutes using our solution at https://research.rlcatalyst.com, feel free to contact marketing@relevancelab.com

References
Build Your Own Supercomputers in AWS Cloud with Ease – Research Gateway Allows Cost, Governance and Self-service with HPC and Quantum Computing
Leveraging AWS HPC for Accelerating Scientific Research on Cloud
Accelerating Genomics and High Performance Computing on AWS with Relevance Lab Research Gateway Solution




As digital adoption grows, so do user expectations for always-on and reliable business services. Any downtime or service degradation can have serious impacts on the reputation of the company and its business with brutal reviews and poor customer satisfaction. The classic pursuit of DevOps helps businesses deliver new digital experiences faster, while Site Reliability Engineering (SRE) ensures the promises of better services actually stay consistent beyond the launch. Relevance Lab is helping customers take DevOps maturity to the next level with successful SRE adoptions and on the path to AIOps implementations.

Relevance Lab has been working with 50+ customers on the adoption of cloud, DevOps, and automation over the last decade. In the last few years, interest in AIOps and SRE has grown, especially among large and complex hybrid enterprises. These companies have sped up their journey to cloud adoption and the techniques of DevOps + Automation across the product lifecycle. At the same time, there is confusion among these enterprises about taking their maturity to the next level of AIOps adoption with SRE.

Working closely with our existing customers, and with best practices built along the way, we present a framework for SRE adoption in large and complex enterprises that leverages a unique approach from Relevance Lab. It is built on the concept of RLCatalyst as a platform for SRE adoption that is faster, cheaper, and more consistent.

The basic questions we have heard from our customers looking at adopting SRE are the following:

  • We want to adopt the SRE best practices similar to Google, but our context of business, applications, and infrastructure is very diverse and needs to consider the legacy systems.
  • A number of applications in our organization are business applications that are very different from digital applications but need to be part of the overall SRE maturity.
  • The cloud adoption for our enterprise is a multi-year program, so we need a model that helps adopt SRE in an iterative manner.
  • The CIO landscape for global enterprises covers different continents, regions, business units (BU), countries, and products in a diverse play covering 200+ applications, and SRE needs to be a framework that is prescriptive but flexible for adoption.
  • The organizational structure for large enterprises is complex, with different specialized teams and specialist vendors helping manage operations across Infrastructure, Applications Support, and Service Delivery that was built for an era of on-premise systems but is not agile.
  • Different groups have tried a movement toward SRE adoption but lack a consistent blueprint and partner who can advise, build, and transform.
  • The lack of SRE shows up on a daily basis as long critical-incident handling times, issues being tossed between groups, repeated poor outcomes, and excessive focus on process compliance without regard for end-user impact.

The basic concepts covered in the blog are the following and are intended to act as a handbook for new enterprises in the adoption of the SRE framework:

  1. What is the definition and the scope of SRE for an enterprise?
  2. Is there a framework that can be used to adopt SRE for a large and complex hybrid enterprise?
  3. How can Relevance Lab help in the adoption and maturity of SRE for a new enterprise?
  4. What is unique about Relevance Lab solutions leveraging a combination of Platform + Services?

What is SRE?
SRE stands for Site Reliability Engineering, the discipline responsible for all critical business services. It ensures that end customers can rely on IT for their mission-critical business services. Site Reliability Engineers (SREs) ensure the availability of these services, building the tools and automation to monitor and enable this availability. A successful SRE implementation also requires the right organizational structure along with tools and technologies.

SRE Building Blocks/Hierarchy of Reliability
Relevance Lab’s SRE Framework consists of 5 building blocks, as shown in the following image.


As shown above, the SRE building block consists of an Initial Assessment, Monitoring and Alerting Optimization, Incident handling with self-heal capability, Incident Prevention, and an end-to-end SRE dashboard.

RL SRE Framework
Relevance Lab’s SRE framework provides a unique approach of Platform + Competencies, proven with multiple global enterprises. RL’s SRE adoption presents a unique way of solving problems related to critical business application availability, performance, and capacity optimization. The primary focus is on ensuring critical business services are available while all issues are proactively addressed. SRE also needs to ensure an automation-led operations model delivers better performance, quality, and reliability.


Our methodology for SRE Implementation consists of the following:


  • The initial step for any application group or family is to understand the current state of maturity. This is done with an assessment checklist, the outcome of which decides whether the application qualifies for SRE implementation. If an application does not qualify, the next step is to put in place the basic requirements needed for effective SRE implementation, after which the application is reassessed.
  • Based on the assessment activity and the gaps identified, we will recommend the steps that need to be in place for an effective SRE model. The outcome of the assessment would translate into an Implementation Plan. Below are the 5 Steps required to implement SRE for an Organization:
    • Level 1: Monitoring – Focuses on the 4 Golden Signals; Service Level Agreements (SLAs), Service Level Objectives (SLOs), and Service Level Indicators (SLIs); and Error Budgets (a minimal error-budget calculation is sketched after this list)
    • Level 2: Incident Response – Alert Management, On-Call Management, RACI, Escalation Chart, Operations Handbook.
    • Level 3: Post-Incident Review – Postmortem of Incidents, Prevention based on Root Cause Analysis
    • Level 4: Release and Capacity Management – Version Control System, Deployment using CI/CD, QA/Prod Environments, Pressure Test
    • Level 5: Reliability Platform Engineering – end-to-end SRE dashboard
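To make the Level 1 concepts concrete, the minimal sketch below shows how an error budget falls out of an availability SLO; the 99.9% target and 30-day window are just example values.

    def error_budget_minutes(slo: float, period_days: int = 30) -> float:
        """Allowed downtime (minutes) implied by an availability SLO over a period."""
        return (1 - slo) * period_days * 24 * 60

    def budget_consumed(observed_downtime_min: float, slo: float,
                        period_days: int = 30) -> float:
        """Fraction of the error budget already burned."""
        return observed_downtime_min / error_budget_minutes(slo, period_days)

    # Example: a 99.9% SLO allows ~43.2 minutes of downtime in 30 days;
    # 20 minutes of observed downtime burns ~46% of the budget.
    print(error_budget_minutes(0.999))   # 43.2
    print(budget_consumed(20, 0.999))    # ~0.46

Tracking this consumed fraction is what lets teams decide, objectively, when to slow releases and spend effort on reliability instead.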

Our Uniqueness
Relevance Lab’s SRE framework for any cloud or hybrid organization goes through an Enablement Phase and a Maturity Phase. In each phase, there are platform-related activities and application-related activities. Every application goes through the Phase 1 Enablement journey to reach stabilization and then moves towards Phase 2 Maturity.

Phase 1 – Enablement is a basic SRE model that helps an enterprise reach a baseline level of SRE implementation; it covers the first 3 stages of the Relevance Lab SRE Framework. This includes the implementation of new tools, processes, and platforms. At the end of this phase, the golden signals, SLIs/SLOs against SLAs, and error budgets are clearly defined, monitored, and tracked. Refined runbooks and operating guides help in the proactive identification of incidents and faster recovery through on-call management. Activities like post-incident reviews, pressure tests, and load testing help stabilize the application and the infrastructure. As part of this phase, an SRE 1.0 dashboard is available as an output to monitor the SRE metrics.

Phase 2 – Maturity is an advanced SRE model that covers the last two stages of the Relevance Lab SRE Framework. It emphasizes an automation-first approach to end-to-end lifecycle management and includes advanced release management, auto-remediation for incident management, security, and capacity management. This is an ongoing maturity phase that brings additional applications and BUs under the scope of the SRE model. The output of this phase is an automated SRE 2.0 dashboard with intelligence-based actionable insights and prevention.

Summary
Relevance Lab (RL) has worked with multiple large companies on the “Right Way” to adopt Cloud and SRE maturity models. We realize that each large enterprise has a different context-culture-constraint model covering organization structures, team skills/maturity, technology, and processes. Hence the right model for any organization has to be created collaboratively, with RL acting as an advisor to Plan, Build, and Run the SRE model based on the RLCatalyst framework.

For more information on RL’s SRE framework and maturity model or for its implementation, feel free to contact marketing@relevancelab.com




Our goal at Relevance Lab (RL) is to make scientific research in the cloud ridiculously simple for researchers and principal investigators. Cloud is driving major advancements in both the Healthcare and Higher Education sectors. Rapidly being adopted by organizations across these sectors in both commercial and public segments, research on the cloud is improving day-to-day lives with drug discoveries, healthcare breakthroughs, innovation of sustainable solutions, development of smart and safe cities, and more.

Powering these innovations, AWS cloud provides an infrastructure with more accessible and useful research-specific products that speed time to insights. Customers get more secure and frictionless collaboration capabilities across large datasets. However, setting up and getting started with complex research workloads can be time-consuming, and researchers often look for simple and efficient ways to run their workloads.

RL addresses this issue with Research Gateway, a self-service cloud portal that allows customers to run secure and scalable research on the AWS cloud without any heavy lifting of setups. In this blog, we will explore different use cases where Research Gateway simplifies workloads and accelerates outcomes. We will also elaborate on two specific use cases from the healthcare and higher education sectors for the adoption of the Research Gateway Software as a Service (SaaS) model.

Who Needs Scientific Research in the Cloud?
The entire scientific community is trying to speed up research for better human lives. While scientists want to focus on “science” and not “infrastructure”, it is not always easy to have a collaborative, secure, self-service, cost-effective, and on-demand research environment. While most customers have traditionally used on-premise infrastructure for research, there is always a key constraint on scaling up with limited resources. Following are some common challenges we have heard from our customers:


  • We have tremendous growth of data for research and are not able to manage with existing on-premise storage.
  • Our ability to start new research programs despite securing grants is severely limited by a lack of scale with existing setups.
  • We have tried the cloud but, especially with High Performance Computing (HPC) systems, are not confident about total spend and budget controls.
  • We have ordered additional servers, but for months, we have been waiting for the hardware to be delivered.
  • We can easily try new cloud accounts but bringing together Large Datasets, Big Compute, Analytics Tools, and Orchestration workflows is a complex effort.
  • We have built on-premise systems for research with Slurm, Singularity Containers, Cromwell/Nextflow, and custom pipelines, and do not have the bandwidth to migrate to the cloud with updated tools and architecture.
  • We want to provide researchers the ability to have their ephemeral research tools and environments with budget controls but do not know how to leverage the cloud.
  • We are scaling up online classrooms and training labs for a large set of students but do not know how to build secure and cost-effective self-service environments like on-premise training labs.
  • We require a data portal for sharing research data across multiple institutions with the right governance and controls on the cloud.
  • We need the ability to run Genomics Secondary Analysis for multiple domains like Bacterial research and Precision Medicine at scale, with cost-effective per-sample runs, without worrying about tools, infrastructure, software, and ongoing support.

Keeping the above common needs in perspective, Research Gateway is solving the problems for the following key customer segments:


  • Education Universities
  • Healthcare Providers
    • Hospitals and Academic Medical Centers for Genomics Research
  • Drug Discovery Companies
  • Not-for-Profit Companies
    • Primarily across health, education, and policy research
  • Public Sector Companies
    • Looking into Food Safety, National Supercomputing centers, etc.

The primary solutions these customers seek from Research Gateway are listed below:

  1. Analytics Workbench with tools like RStudio and SageMaker
  2. Bioinformatics Containers and Tools from the standard catalog and bring your own tools
  3. Genomics Secondary Analysis in Cloud with 1-Click models using open source orchestration engines like Nextflow, Cromwell and specialized tools like DRAGEN, Parabricks, and Sentieon
  4. Virtual Training Labs in Cloud
  5. High Performance Computing Infrastructure with specialized tools and large datasets
  6. Research and Collaboration Portal
  7. Training and Learning Quantum Computing

The figure below shows the customer segments and their top use cases.

How Is Research Gateway Powering Frictionless Outcomes?
Research Gateway allows researchers to conduct just-in-time research with 1-click access to research-specific products, provision pipelines in a few steps, and take control of the budget. This helps in the acceleration of discoveries and enables a modern study environment with projects and virtual classrooms.

Case Study 1: Accelerating Virtual Cloud Labs for the Bioinformatics Department of Singapore-based Higher Education University
During interaction with the university, the following needs were highlighted to the RL team by the university’s bioinformatics department:

Classroom Needs: Primary use case to enable Student Classrooms and Groups for learning Analytics, Genomics Workloads, and Docker-based tools

Research Needs: Used by a small group of researchers pursuing higher degrees in Bioinformatics space

Addressing the Virtual Classroom and Research Needs with Research Gateway
The SaaS model of Research Gateway is used with a hub-and-spoke architecture that allows customers to configure their own AWS accounts for projects to control users, products, and budgets seamlessly.

The primary solution includes:


  • Professors set up classrooms and assign students for projects based on semester needs
  • Usage of basic tools like RStudio, EC2 with Docker, MySQL, and SageMaker
  • A special ask, port forwarding so that a local RStudio IDE could connect to shared data in the cloud, was also successfully put to use
  • End-of-day automated “server still running” reports to students and professors for cost optimization
  • Ability to create multiple projects in a single AWS Account + Region for flexibility
  • Ability to assign and enforce student-level budget controls to avoid overspending

Case Study 2: Driving Genomics Processing for Cancer Research of an Australian Academic Medical Center
While the existing research infrastructure is an on-premise setup due to security and privacy needs, the team is facing serious challenges with growing data and the influx of new genomics samples to be processed at scale. A team of researchers is taking the lead in evaluating AWS Cloud to solve the issues related to scale and drive faster research in the cloud with in-built security and governance guardrails.

Addressing Genomic Research Cloud Needs with Research Gateway
RL addressed the genomics workload migration needs of the hospital with the Research Gateway SaaS model using the hub-and-spoke architecture, which allows the customer to have exclusive access to their data and research infrastructure by bringing their own AWS account. Also, the deployment of the software is in the Sydney region, complying with in-country data norms as per governance standards. Users can easily configure AWS accounts for genomics workload projects. They also get 1-click access to genomic research-related products along with seamless budget tracking and pausing.

The following primary solution patterns were delivered:


  • Migration of existing HPC system using Slurm Workload Manager and Singularity Containers
  • Using Cromwell for Large-Scale Genomic Sample Processing (a workflow submission sketch follows this list)
  • Using complex pipelines with a mix of custom and public WDL pipelines like RNA-Seq
  • Large Sample and Reference Datasets
  • AWS Batch HPC leveraged for cost-effective and scalable computing
  • Specific Data and Security needs met with country-level data safeguards & compliance
  • Large set of custom tools and packages
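As referenced in the list above, Cromwell in server mode exposes a REST API for workflow submission. The minimal sketch below shows how a WDL pipeline and its inputs could be submitted with Python; the host name and file names are placeholder assumptions, not details of the customer's actual deployment.

    import requests

    # Placeholder host; Cromwell in server mode listens on port 8000 by default
    CROMWELL_URL = "http://cromwell-host:8000/api/workflows/v1"

    with open("rnaseq.wdl", "rb") as wdl, open("rnaseq.inputs.json", "rb") as inputs:
        resp = requests.post(
            CROMWELL_URL,
            files={"workflowSource": wdl, "workflowInputs": inputs},
        )
    resp.raise_for_status()
    print(resp.json())  # e.g. {"id": "<workflow-uuid>", "status": "Submitted"}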

The workload currently operates in an on-premise HPC environment, using Slurm as the orchestrator and Singularity containers. Migration involves converting the Singularity containers to Docker containers so that they can be used with AWS Batch. The pipelines are based on Cromwell, one of the leading workflow orchestrators, developed by the Broad Institute. The picture below contrasts the existing on-premise system with the target cloud-based system.


Conclusion
Relevance Lab, in partnership with AWS, is driving frictionless outcomes by enabling secure and scalable research with Research Gateway across various use cases. By making it seamless to set up and run research workloads in just 30 minutes with self-service access and cost control, the solution enables the creation of internal virtual labs and the acceleration of complex genomic workloads.

To know more about virtual Cloud Analytics training labs and launching Genomics Research in less than 30 minutes explore the solution at https://research.rlcatalyst.com or feel free to write to marketing@relevancelab.com

References
Accelerating Genomics and High Performance Computing on AWS with Relevance Lab Research Gateway Solution
Enabling Frictionless Scientific Research in the Cloud with a 30 Minutes Countdown Now!




While there is a lot of talk about Digital Innovation leveraging the cloud, another key disruption in the industry is Applied Science Innovation, led by scientists and engineers targeting a broad range of disciplines in Engineering and Medicine. Relevance Lab is proud to make it easy to leverage powerful tools like High-Performance Computing (HPC) and Quantum Computing on AWS Cloud for such pursuits with our Research Gateway product.

What is Applied Science?
Applied Science uses existing scientific knowledge to solve day-to-day problems in areas like Health Care, Space, Environment, Transportation, etc. It leverages the power of new technologies such as Big Compute and Cloud to drive faster scientific research. Innovation in Applied Science has some unique differences compared to Digital Innovation:


  • Users of Applied Science are researchers, scientists, and engineers
  • Workloads for Applied Science are driven by more specialized systems and domain-specific algorithms & orchestration needs
  • Very large domain-specific data sets and collaboration with a large ecosystem of global communities is a key enabler with a focus on open-source and knowledge sharing
  • Use of specialized hardware and software is also a key enabler

The term Big Compute is used to describe large-scale workloads that require multiple cores (with specialized CPU and GPU types) working with very high-speed network and storage architectures. Such Big Compute architectures solve the problems in image processing, fluid dynamics, financial risk modeling, oil exploration, drug design, etc.

Relevance Lab is working closely with AWS in pursuing specialized use cases for Applied Science and Scientific Research using the cloud. A number of government, public, and private sector organizations are focusing large amounts of investment and scientific knowledge on driving innovation in these areas. A few specialized ones with well-known programs are listed below.


What is High Performance Computing?
Supercomputers of the past were very specialized and high-cost systems that could only be built and afforded by large and well-funded institutions. Cloud computing is driving the democratization of supercomputers by providing High Performance Computing (HPC) systems that have specialized architectures. It combines the power of on-demand computing with large & specialized CPU/GPU types, high-speed networking, fast access storage, and associated tools & utilities for workload orchestration and management. The figure below shows the key building blocks of HPC components of AWS Cloud.


What is Quantum Computing?
Quantum computing relies upon quantum theory, which deals with physical phenomena at the nano scale. One of its most important concepts is the quantum bit (qubit), a unit of quantum information that, thanks to the superposition principle of quantum physics, can exist in two states (for example, horizontal and vertical polarization) at the same time.

The Amazon Braket quantum computing service helps researchers and developers use quantum computers and simulators to build quantum algorithms on AWS.
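As a small taste of the programming model, the sketch below uses the Braket SDK to build a two-qubit Bell state and run it on the local simulator; swapping LocalSimulator for an AwsDevice ARN targets Braket's managed simulators or real quantum hardware.

    from braket.circuits import Circuit
    from braket.devices import LocalSimulator

    # Bell state: Hadamard puts qubit 0 in superposition, CNOT entangles qubit 1
    bell = Circuit().h(0).cnot(0, 1)

    device = LocalSimulator()
    result = device.run(bell, shots=1000).result()

    # Measurements collapse to '00' or '11' with roughly equal probability
    print(result.measurement_counts)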


Key Use Cases:

  • Research quantum computing algorithms
  • Test different quantum hardware
  • Build quantum software faster
  • Explore industry applications

What Do Customers Want?
The availability of specialized services like HPC and Quantum Computing has made it extremely simple for customers to be able to consume these advanced technologies and build their own supercomputers. However, when it comes to the adoption cycle, customers are hesitant to adopt the same due to key concerns and asks, as summarized below:

Operational Asks:

  • The top challenge and fear on the cloud is the variable cost model, which can throw a big surprise, and customers want strong Cost Management & Tracking with auto-limits control
  • Security and data governance are also key priorities
  • Data transfer and management are the other key needs

Functional Asks:
  • Faster and easier design, provisioning, and development cycles
  • Integrated and automated tools for deployment and monitoring
  • Easy access to data and the ability to do self-service
  • Derive increased business value from Data Analytics and Machine Learning

How Does Research Gateway Solve Customer Needs?
AWS cloud offerings provide a strong platform for HPC and quantum computing requirements. However, enabling scientific research and training of researchers requires the ability to offer these through a self-service portal that encapsulates the underlying complexity. On top of that, proper cost tracking and control, security, data management, and an integrated workbench are needed for a collaborative research environment.

To address the above needs, Relevance Lab has developed Research Gateway. It helps scientists accelerate their research on the AWS cloud with access to research tools, data sets, processing pipelines, and analytics workbenches in a frictionless manner. The solution also addresses the need for tight control on a budget, data security, privacy, and regulatory compliances, which it meets while significantly simplifying the process of running complex scientific research workloads.

Research Gateway meets the following key dimensions of collaborative and secure scientific research:

  • Cost and Budget Governance: The solution offers easy control over cost tracking of research cloud resources to track, analyze, control, and optimize budget spending. Principal Investigators can also pause or stop consumption if spending exceeds the set budget threshold.
  • Research Data & Tools for Easy Collaboration: Research Gateway provides the team of researchers real-time view of research-specific product catalog, cost, and governance, reducing the complexities of running scientific research on the cloud.
  • Security and Compliance: Principal investigators have a unified view and control over security and compliance, covering Identity management, data privacy, audit trails, encryption, and access management.

Principal investigators leading the research get a quick insight into the total budget, consumed budget, and available budget, along with the available research-specific products, as shown in the image below.

With Research Gateway, researchers can provision available research-specific products for their high-performance and quantum computing needs in just 1-click, launching scientific research in as little as 30 minutes.


Summary
High Performance Computing and Quantum computing are essential to the advancement of science and engineering now more than ever. Research Gateway provides fundamental building blocks for Applied Science and Scientific Research in the AWS cloud by simplifying the availability of HPC and Quantum computing for customers. The solution helps create democratized supercomputers on-demand while eliminating the pain of managing infrastructure, data, security, and costs, enabling researchers to focus on science.

To know more about how you can use high-performance and quantum computing with just 1-click and launch your research in 30 minutes using our solution at https://research.rlcatalyst.com, feel free to contact marketing@relevancelab.com

References
High-performance genetic datastore on AWS S3 using Parquet and Arrow
Parallelizing Genome Variant Analysis
Leveraging AWS HPC for Accelerating Scientific Research on Cloud
Genomics Cloud on AWS with RLCatalyst Research Gateway
Enabling Frictionless Scientific Research in the Cloud with a 30 Minutes Countdown Now!
Accelerating Genomics and High Performance Computing on AWS with Relevance Lab Research Gateway Solution




Cloud is no longer a “good-to-have” technology but rather a must-have for enterprises. Although cloud-led digital transformation has been a buzzword for years, enterprises had their own pace of cloud adoption. However, the pandemic necessitated the acceleration of cloud adoption. Enterprises are faced with a new normal of operation that requires the speed and agility of the cloud.

In this blog, we will discuss the ground realities and challenges. We will also explore how Relevance Lab (RL) offers the right mix of experience and proven approaches to grow in today’s hyper-agile industry environment.

A Changed Ground Reality
The pandemic has accelerated how organizations look at IT infrastructure spending. It has also permanently changed their cloud strategies and spending habits. Online reports suggest that 38% more companies took a cloud-first approach compared to 2020, with an increased focus on IaaS- and PaaS-based approaches.

According to a Gartner online survey, enterprises have moved up their cloud adoption by several years, and this is expected to continue in the near future. The survey also predicts that enterprises will spend more on just-in-time, value-based adoption to match the demands of a hyper-competitive environment.

Migration and modernization with the cloud is a long-term trend, especially for enterprises with a need to scale up. As CAPEX takes a back seat, OPEX is now at the forefront. Cloud as an industry has matured and evolved over a period of time, enabling faster and better adoption with hyper accelerator tools.

Criteria for the Successful Cloud Journey
The success of an enterprise’s cloud adoption journey can be evaluated by setting and measuring against the right KPIs. A successful cloud journey would help an enterprise achieve “business as usual” along with enhanced business outcomes and customer experience. It standardizes the framework for maintainability and traceability, improves security, and optimizes the cost of ownership, as shown in the image below.


Common Cloud Migration Challenges
Planning for and meeting all the criteria of a successful cloud journey has always been an uphill task. Some of the common challenges are:

Large Datasets: Businesses today are dealing with larger and more unstructured datasets than ever before.

Selection of the Right Migration Model: Many enterprises starting their cloud journey have to choose the right migration model for their needs, such as legacy re-write, lift & shift, and everything in between. The decision is based on various factors like cost and business outlook, and can impact business performance and operations in the longer run.

Change Management for Adopting a New Way of Operation: Cloud migration requires businesses to expand their knowledge at a rapid rate along with real-time analytics & personalization.

Security Framework: The risk of hackers & security attacks is growing across most industries. To keep up with the security while successfully moving to the cloud, enterprises need robust planning and an action list. Also, enterprises must choose a security framework depending on their size, industry, compliance, and governance needs.

Lack of Proper Planning: Rushed application assessments give rise to a lot of gaps that can affect the cloud environment. As a move into the cloud impacts different verticals and businesses as a whole, all stakeholders must be on the same page when it comes to an adoption plan.

Profound Knowledge: Cloud migration requires a dedicated and experienced team to troubleshoot any problems. Building an in-house team is a time-consuming, costly, and tumultuous task, while working with partners whose knowledge branches into many different technologies may not be beneficial either. Enterprises may need a partner with a focused understanding of the cloud migration niche, with knowledge assimilated from engagements with various customers.

Continuous Effort: Cloud is ever-changing with new developments and evolving paradigms. Thus, cloud migration is not a one-time task but rather requires continuous effort to automate and innovate accordingly.

Solutions to Cloud Migration Challenges
Some of the potential solutions that an enterprise can adopt to overcome common challenges of cloud migration are:

  • Reassessing cloud business & IT plans
  • Identifying and remediating risks and gaps in data, compliance, and tech stack
  • Detailing migration approaches with self-sufficient virtual ecosystems
  • Building, delivering, and failing fast
  • Using data-driven analysis to enable stakeholders to make quick and effective decisions

Planning and implementing these solutions requires extensive experience and knowledge. With the right approach and solution combined, enterprises can reap the benefits of the cloud easily.

How Relevance Lab Helps Businesses Accelerate their Cloud Journey
Relevance Lab (RL) is a specialist company in helping customers adopt cloud “The Right Way”. It covers the full lifecycle of migration, governance, security, monitoring, ITSM integration, app modernization, and DevOps maturity. We leverage a combination of services and products for cloud adoption. Helping customers on a “Plan-Build-Run” transformation, we drive greater velocity of product innovation, global deployment scale, and cost optimization.

Building Mature Cloud Journey
Moving to the cloud opens up numerous opportunities for enterprises. To reap all the benefits of cloud migration, enterprises need a comprehensive strategy focused on building value, speed, resilience, scalability, and agility to optimize business outcomes. Having worked with businesses across the globe for over a decade, our teams have seen a common trend that enterprises are often unaware of unprecedented adoption challenges, the “day-after” surprises and complexities, or the chronology of their occurrence.

This begs the question: how can enterprises overcome such surprises? Relevance Lab helps you answer it with a comprehensive and integrated approach. Combining cloud expertise and experience, we help enterprises overcome any challenge or surprise coming their way. Meeting the current needs of clients, we help you build a cohesive and well-structured journey. Here are a few ways Relevance Lab helps you achieve it:

1. Assess the Current State & Maturity of the Cloud Journey
Any enterprise must get a clear picture of its current state before it builds a cloud strategy. At Relevance Lab, we help clients assess their structures and requirements to identify their current stage in the cloud maturity journey. The cloud maturity model has 5 stages, namely Legacy, Cloud Ready, Cloud Friendly, Cloud Resilient, and Cloud Native, as shown in the image below. This helps us adopt the right approach that matches the exact needs of our clients.


Once the current stage is determined after an assessment, RL helps in designing an effective cloud strategy with a comprehensive and integrated approach, keeping a balance between cloud adoption and application modernization. We ensure that all elements of cloud adoption move together, i.e., cloud engineering, cloud security, cloud governance & operating model, application strategy, engineering & DevSecOps, and cloud architecture, as shown in the image below.


2. Execute & Deliver through a Cross-Functional Collaboration and Gating Process
After the approach is defined and the strategy is designed, workstreams that integrate people, tools, and processes are identified. Cloud adoption excellence is delivered through cross-functional collaboration and gating across workstreams and stages, as shown in the image below.


How We Helped a Publishing Major Migrate “The Right Way”
Let’s explore a detailed account of how we implemented them for a global publishing major to maximize cloud benefits.

The publishing major was heavily reliant on complex legacy applications and an outdated tech stack, resulting in security and legal liabilities. There was a pressing need to scale IT and product engineering to meet market demands driven by a usage uptick (triggered by the pandemic). Another immediate requirement was better data gathering and analytics to enable faster decision making.

Relevance Lab provided an enterprise cloud migration solution with a data-driven plan and collaboration with business stakeholders. A comprehensive framework prioritizing customer-centric applications for scale and security was put in place. RL helped in implementing an integrated approach leveraging cloud-first and secure engineering & deployment practices along with automation to accelerate development, deployment, testing & operations.


To further learn about the details of how RL helped the above global publishing giant, download our case study.

Conclusion
Given the current times, a cloud adoption strategy requires a data-backed understanding of the current systems and logical next steps, ensuring business runs as usual. There are many challenges that an enterprise may face throughout its cloud journey. Most of these may come as a surprise, as teams are often unaware of the chronological order in which the complexities occur.

Relevance Lab, an AWS partner, has an integrated approach and offerings developed through years of experience in delivering successful cloud journeys to clients across all industries and regions. Like the global publishing major discussed in this blog, we have helped clients significantly reduce costs by implementing modernization in parallel behind the scenes while their businesses run as usual.

To know more about cloud migration or implement the same for your enterprise, feel free to connect with marketing@relevancelab.com

References:
Cloud Management, Automation, DevOps and AIOps – Key Offerings from Relevance Lab
Relevance Lab Playbooks for Frictionless IT and Business Operations
Leveraging Technology + Consulting Specialization for Products and Solutions




With the growing demand for moving to the cloud, organizations also face various challenges, such as the inability to track costs, health, security, and assets at the application level. Having this ability can help organizations get a clear picture of their business metrics (revenue, transaction costs, customer-specific costs, etc.). Some of the other challenges they face are as follows:

  • No clear definition of what is a service or application. The concept keeps changing from customer to customer based on the business’s criticality and need.
  • Separation of business applications from Internal applications or software services.
  • Deployment of applications across accounts and regions makes consolidation harder.
  • Dependent services and microservice concepts complicate the discovery process.
  • Complex setup involving clustered and containerized deployments promoting service-oriented architecture.
  • What is the target business/efficiency goal? Is it tracking cost, better diagnostics, or CMDB? How does it link to business-unit or application-level spend tracking?

Modeling a Common Customer Use Case


A typical large enterprise goes through a maturity journey from a scattered Infrastructure Asset Management to a more matured Application Asset Management.

Need for Automated Application Service Mapping
Applications, related to business units and business services, are the common focal points highlighted by customers.

  • It is important to track the cost and expenses at the application level for chargebacks. This requires an asset and cost-driven architecture. There is no common way to automate the discovery of such applications unless defined by Customers and linked to their infrastructure.
  • Business endpoint applications are served as a combination of assets and services
  • Knowing such dynamic topology can help with better monitoring, diagnostics, and capacity planning
  • There is a way to discover the infrastructure linked to templates and a service registry, but no easy way to roll that up to an application-level linkage

RLCatalyst AppInsights Solution
RLCatalyst AppInsights helps enterprises understand their current state of maturity by defining the global application master and linkage to business units. This is done using the discovery process to link applications, assets, and costs as a one-time activity. In this process, assets are categorized into two categories – allocated or mapped assets (i.e., assets linked to templates) and unallocated assets (i.e., assets that are not linked to any templates).


As shown in the above picture of the discovery process, all assets across your AWS accounts are brought into ServiceNow asset tables using the Service Management Connector. RLCatalyst AppInsights then demarcates assets linked to templates from those that are not linked to any template (unallocated assets). At this stage, we have cost allocations across assets linked to templates and unallocated assets. The next step is linking the templates to applications, creating a mapping between applications and business units.

Similarly, for all the unallocated assets, we can look at either linking them to newly created templates or linking them to a project and terminating/cleaning them up. Once all this is in place, the data automatically builds your dashboard in terms of cost by application, project, BU, and unallocated costs.

For any new application deployment and infrastructure setup, the standard process ensures assets are provisioned through templates and appropriate tags are applied. This is enforced using guardrails for ongoing operations.
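Such a tagging guardrail can be spot-checked programmatically. The minimal sketch below uses the AWS Resource Groups Tagging API to list resources missing a required tag; the tag key "ApplicationId" is a hypothetical example, not necessarily the key AppInsights uses.

    import boto3

    tagging = boto3.client("resourcegroupstaggingapi")

    REQUIRED_TAG = "ApplicationId"  # hypothetical required tag key

    untagged = []
    for page in tagging.get_paginator("get_resources").paginate():
        for res in page["ResourceTagMappingList"]:
            keys = {t["Key"] for t in res.get("Tags", [])}
            if REQUIRED_TAG not in keys:
                untagged.append(res["ResourceARN"])

    print(f"{len(untagged)} resources missing the {REQUIRED_TAG} tag")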


As shown above, the plan is to have an updated version of AppInsights V2.0 on ServiceNow store by the end of 2022, which will include the following additional features.

  • Automated Application Service Discovery (AASD)
  • Cross account Applications Cost tracking
  • Support for Non-CFT based applications like Terraform
  • Security and Compliance scores at an account level
  • Support for AppRegistry 2.0

AWS Standard Products and Offerings in This Segment
AWS provides some key products and building blocks that are leveraged in the AppInsights solution.


Summary
Managing your cloud with an Application-Centric Lens can provide effective data analysis, insights, and controls that better align with how large enterprises track their business and Key Performance Indicators (KPIs). Traditionally, the cloud has provided a very Infrastructure-centric and fragmented view that does not allow for actionable insights. This problem is now solved by Relevance Lab AppInsights 2.0.

To learn more about building cloud maturity through an Application-centric view or want to get started with RLCatalyst AppInsights, feel free to contact marketing@relevancelab.com

References
Governance 360 – Are you using your AWS Cloud “The Right Way”
ServiceNow CMDB
Increase application visibility and governance using AWS Service Catalog AppRegistry
AWS Security Governance for Enterprises “The Right Way”
Configuration Management in Cloud Environments




Automated deployment of software makes the process faster, easier, repeatable, and more supportable. A variety of technologies are available for deployment, but you need not necessarily choose a complex automation approach to reap the benefits. In this blog, we will cover how Relevance Lab approached using automation for the deployment of their RLCatalyst Research Gateway solution.

RLCatalyst Research Gateway solution from Relevance Lab provides a next-generation cloud-based platform for collaborative scientific research on AWS with access to research tools, data sets, processing pipelines, and analytics workbenches in a frictionless manner. The solution can be used in the Software as a Service (SaaS) mode, or it can be deployed in customers’ accounts in the enterprise mode. It takes less than 30 minutes to launch a working environment for Principal Investigators and Researchers with security, scalability, and cost governance.


During the deployment of this solution, several AWS resources are created:

  • Networking (VPC, Public and private subnets, Internet and NAT Gateways, ALB)
  • Security (Security Groups, Cognito Userpool for authentication, Identity and Access Management (IAM) Roles and Policies)
  • Database (AWS DocumentDB cluster)
  • EC2 Compute
  • EC2 Image Builder pipelines
  • S3 Buckets (storage)
  • AWS Service Catalog products and portfolios

When such a variety of resources are to be created, there are several benefits of automating the deployment.

  • Faster Deployment: It takes an engineer at least a few hours to deploy all the resources manually, assuming everything works to plan. If errors are encountered, it takes longer. With an automated deployment, the process is much quicker, and it can be done in 15-30 minutes.
  • Easier: The deployment automation encapsulates and hides a lot of the complexity of the process, and the engineer performing the task does not need to know a lot of the different technologies in depth. Also, since the automation has been hardened over time through repeated testing in the lab, much of the error handling has been codified within the scripts.
  • Repeatable: The deployment done via automation always comes out exactly as designed. Unlike manual deployment, where unforced user errors can creep in, the scripts perform exactly the same way on every run. Also, scripts can be coded to fix broken installs or redeploy solution software.
  • Supportable: Automation scripts can have logging, which makes it easy for support personnel to help in case things don’t go as planned.

There are many technologies that can help automate the deployment of software. These include tools like Chef and Ansible, language-specific package managers like PyPI or npm, and Infrastructure as Code (IaC) tools like CloudFormation or Terraform. For RLCatalyst Research Gateway, which is built on AWS, we picked CloudFormation Templates (CFT) for our IaC needs in combination with plain old shell scripts. Find our deployment scripts on Github.


  • Pre-requisites: We deploy Research Gateway in a standard Virtual Private Cloud (VPC) architecture with both public and private subnets. This VPC can be created using a quickstart available from AWS itself.
  • Infrastructure: The infrastructure is created as five different stacks.
    • Amazon S3 bucket: This is used to hold all the deployment artifacts like CFT templates.
    • AWS Cognito UserPool: This is used for authentication.
    • AWS DocumentDB: This is used to store all persistent data required by Research Gateway.
    • Amazon EC2 Image Builder: Pipelines are created to rebuild the Amazon Machine Images (AMIs) for the standard catalog items that are AMI-based. This ensures that the AMIs have the latest patches and security fixes.
    • Amazon EC2 (main stack): This hosts the Research Gateway portal.
  • Configuration: Some of the instance-specific data is part of the configuration, which is stored in one of the following ways.
    • Files: Configuration files are created during the deployment process, using data provided at the time. These files are referred to by the solution software to customize its behavior. File-based configurations are easy for support personnel to access and can be checked quickly if the solution software is not behaving as expected.
    • Database Entries: A configs collection in the database hosts some of the information. Ideally, all configurations can reside in the database, but because the database is encrypted and has restricted access, we prefer to keep some of the configurations outside the DB.
    • AWS Systems Manager (SSM) Parameter Store: Some configurations, especially those related to AMIs, which are resolved by CFTs at run-time, are maintained in the AWS SSM Parameter Store (see the sketch after this list).
  • Research Gateway Solution Software: Distributed as docker images via AWS Elastic Container Registry (ECR). This allows us to distribute the solution software privately to the customers’ AWS accounts. Our solution software runs as a set of docker services. A variation of the deployment script can also deploy this as services into AWS Elastic Kubernetes Service.
  • Load-balancing: The EC2 instances deployed register themselves with Target Groups, and an Application Load Balancer serves the application securely over SSL using certificates hosted in AWS Certificate Manager.
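As referenced in the list above, the AMI-related parameters can be read and updated with a few lines of boto3. This is a minimal sketch; the parameter path is a hypothetical example, not the actual path used by the deployment scripts.

    import boto3

    ssm = boto3.client("ssm")

    # Hypothetical parameter path for a catalog item's latest AMI
    PARAM = "/researchgateway/ami/rstudio/latest"

    # CFTs resolve such parameters at run-time; scripts update them after an
    # Image Builder pipeline publishes a freshly patched AMI.
    ami_id = ssm.get_parameter(Name=PARAM)["Parameter"]["Value"]
    print("current AMI:", ami_id)

    ssm.put_parameter(Name=PARAM, Value="ami-0123456789abcdef0",
                      Type="String", Overwrite=True)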

Once the solution software is deployed and the portal is running and reachable, the first user (an Admin role) is created using a script. Using those Administrator credentials, the rest of the onboarding process can be completed by the customer from the UI.

Summary
Using the automated deployment process, an instance of the RLCatalyst Research Gateway can be provisioned and configured in less than 30 minutes. This allows customers to start using the solution quickly and derive maximum benefits from their investment with minimum effort.

If you would like to launch your scientific research environment in less than 30 minutes with RLCatalyst Research Gateway or would like to learn more about it, write to us at marketing@relevancelab.com.

References
Architecting a Cloud-based Application with AWS Best Practices
Enabling Frictionless Scientific Research in the Cloud with a 30 Minutes Countdown Now!




In recent times, Next Generation Sequencing (NGS) has transformed from being solely a research tool to being routinely applied in many fields, including diagnostics, outbreak investigations, antimicrobial resistance, forensics, and food authenticity. The use of cloud and modern open-source tools is driving advancement at a rapid pace, with continuous improvement in quality and reduction in cost, and is having a major influence on food microbiology. Public health labs and food regulatory agencies globally are embracing Whole Genome Sequencing (WGS) as a revolutionary new method. In this blog, we introduce this interesting area and cover a common use case of bacterial genome analysis in the cloud using our Research Gateway. We will show how to run a powerful tool like Bactopia, a flexible pipeline for the complete analysis of bacterial genomes, in a few minutes.

What is Bactopia?
Sequencing of bacterial genomes is gathering momentum, with greater adoption underway. Bactopia, developed by Robert A. Petit III, is a series of pipelines built using the Nextflow workflow software to provide efficient comparative genomic analyses for bacterial species or genera. This pipeline has more advanced features than many others in a similar space.

The image below shows the High Level Components of Bactopia Pipeline.


What Makes Bactopia More Powerful Compared to Other Similar Solutions?
The following data shared by the authors of this pipeline highlights the key strengths.



Usually, setting up a secure environment and getting access to data, big compute, and analytics tools is a significant effort for researchers. With Research Gateway built on AWS, we make it extremely simple to get started.

An Introduction to Running Bactopia on AWS Cloud
Bactopia is a software pipeline for the complete analysis of bacterial genomes, based on the Nextflow bioinformatics workflow software. Research Gateway makes it easy to run Nextflow-based pipelines, and we will show how the same can be achieved with Bactopia.

Steps for Running Bactopia Pipeline on AWS Cloud
Step-1: Using the publicly available Bactopia repository on GitHub, a new AWS AMI is created by installing the Bactopia software on top of the Nextflow advanced product available in the Research Gateway standard catalog. This step is needed since Bactopia integrates a large number of specialized tools and embeds Nextflow internally for its execution. Once the new Bactopia AMI is ready, it is added to AWS Service Catalog and imported into Research Gateway to be used by researchers. The product is available in the standard products category to be launched with 1-click, as shown below.


Step-2: Once the Bactopia product is ordered using a simple screen as shown above, in about 10 minutes the setup, with all the tools, Nextflow, and Nextflow Tower, is provisioned and ready to be used. The user can log in to the Bactopia server using the SSH key pair available from within the portal UI using the “SSH/RDP Connect” action, as shown below.

Step-3: Copy data to the Bactopia server for the samples to be processed, and start the execution of the workflow as per the available documentation (a data-staging sketch is shown below). In our case, we tried a smaller set of sample datasets, and it took us 15 minutes to run the pipeline and view the outputs in the console window.
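The data staging in Step-3 can itself be scripted. Below is a minimal boto3 sketch; the bucket name and sample keys are placeholders for your own project data.

    import boto3

    s3 = boto3.client("s3")

    BUCKET = "my-research-bucket"  # placeholder project data bucket
    samples = [
        "samples/sample01_R1.fastq.gz",
        "samples/sample01_R2.fastq.gz",
    ]

    # Pull paired-end reads onto the Bactopia server before starting the run
    for key in samples:
        s3.download_file(BUCKET, key, key.split("/")[-1])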

Step-4: When the pipeline is being executed using Nextflow Tower, details of the jobs and all key metrics can be viewed by the user from within the Research Gateway by selecting the “Monitor Pipeline” action. The entire complexity of different tools integrated on the platform is invisible to the user making it a seamless experience.

Step-5: The outputs generated by the Bactopia pipeline can be viewed from within the Portal using the “View Outputs” action, which presents them in a simple file browser; they can also be opened with specialized tools like Integrative Genomics Viewer (IGV) or viewed as MultiQC reports.

All the products used in Research Gateway are automatically tagged and tracked for cost purposes, and total consumption can be easily verified by project, researcher, and product type, providing a powerful cost management and budget tracking tool.

Summary
As genomics adoption grows and new use cases emerge for leveraging the power of this technology, food safety is an area of growing need, enabled by bacterial genome analysis using advanced pipelines freely available in the open-source community. To help researchers use such powerful tools quickly in the cloud, and to let them focus on science instead of the complexity of infrastructure, networks, and security, we have demonstrated in this blog how Research Gateway can be used to run your first pipeline in less than 60 minutes.

To know more about how you can start your bacterial genome analysis pipelines on the AWS Cloud in less than 60 minutes using our solution at https://research.rlcatalyst.com, feel free to contact marketing@relevancelab.com.

References
An introduction to running Bactopia on Amazon Web Services (May 2021)
Using AWS Batch to process 67,000 genomes with Bactopia (December 2020)
Accelerating Genomics and High Performance Computing on AWS with Relevance Lab Research Gateway Solution




2022 Blog, Blog, Featured

The Research Gateway SaaS solution from Relevance Lab provides a next-generation cloud-based platform for collaborative scientific research on AWS, with access to research tools, data sets, processing pipelines, and analytics workbenches in a frictionless manner. It takes less than 30 minutes to launch a “MyResearchCloud” working environment for Principal Investigators and Researchers with security, scalability, and cost governance. The Software as a Service (SaaS) model is a preferable option for consuming functionality, but in the area of scientific research it is equally critical to have tight control over data security, privacy, and regulatory compliance.

One of the growing needs from customers is to use the solution for online training and specialized use cases such as Bioinformatics courses. With the pandemic, there is tremendous new interest among students in pursuing life sciences courses and specializing in Bioinformatics. At the same time, education institutions are struggling to move their internal training lab infrastructure from data centers to the cloud. As an AWS partner specializing in Higher Education, we are working with a number of universities to understand their needs better and provide solutions that address them in an easy and cost-effective manner.

The top five use cases shared by customers for setting up Virtual Cloud Labs for courses like Bioinformatics are the following:


  • Enterprise Needs: The ability to move easily from data-center-based physical labs to cloud-based virtual labs using corporate cloud accounts, without compromising security, with tight cost controls and a self-service portal for instructors and students. Enterprise-grade controls on budgets, student/instructor access, data security, and an approved products catalog.
  • Business Needs: The setup of a new virtual training lab should support the key learning and research needs of the students.
    • Programs that provide lab access to students for the duration of a full semester, based on the academic calendar.
    • Longer-term projects and programs with lab access based on research grants and their associated budget/time constraints.
  • IT Department Needs: University corporate IT must be able to let specific departments (like Bioinformatics) run their own programs and projects in a self-service manner without compromising enterprise security and compliance needs.
  • Curriculum Department Needs: Department heads (for example, Bioinformatics) and instructors must be able to define the learning curriculum and associated training programs with access to classroom and research labs. Departments also need tight control over budgets and student access management.
  • Student Needs: Students must be able to access cloud-based training labs in a very easy and simple manner without requiring deep cloud knowledge, with pre-built solutions for basic needs covering analytics tools like RStudio/Jupyter, access to secure data repositories, open-source tools/containers, and a collaboration portal.

The following picture describes the basic organization and role setup in a university.



To balance the need for speed with compliance, we have designed a unique model that allows universities to “Bring Your Own License” while leveraging the benefits of SaaS in a hybrid approach. Our solution provides a “Gateway Model” with a hub-and-spoke design: we provide and operate the “Hub”, while universities and their departments connect their own AWS research accounts as “Spokes” and get started within 30 minutes with full access to a complete classroom toolkit. A sample of the out-of-the-box Bioinformatics Lab tools available in the standard catalog is shown below.


Professors can add more tools to the standard catalog by importing their own AMIs using AWS Service Catalog. It is also very simple to create new course material and support additional tools using the base building blocks provided out-of-the-box.

Currently, it is not easy for universities, their IT staff, professors, students, and research groups to leverage the cloud for their scientific research. On-premise data centers are constrained, and while these institutions do have access to cloud accounts, converting a basic account into a secure network with secure access, the ability to create and publish a product/tools catalog, data ingress and egress, sharing of analysis, and enforcement of tight budget controls are non-trivial tasks that divert attention away from education to infrastructure.

Based on our discussions with stakeholders, it was clear that users want something as easy to consume as other consumer-oriented services like e-shopping and consumer banking. This led to the simplified process of creating a “My-Bioinformatics-Cloud-Lab” with the following basic needs:

1. A university can sign up with Research Gateway (SaaS) to enable its different departments to use the software for online training and research needs. For such university-level adoption, we recommend the enterprise version of the software (hosted by us or by the university itself), used across different departments (called Organizations or Business Units).
2. A simpler alternative is for a particular department to create a tenant on our hosted version of Research Gateway, with no overhead of maintaining a university-specific deployment.
3. A Head of Department (HOD) can sign up to create a new tenant on Research Gateway and configure their own AWS billing account to create projects. Each project can then invite other professors to be part of the online training labs. Projects can be aligned with semester-based classroom lab needs or be part of ongoing research projects. Each project has an assigned budget along with the associated professors and students who have access to it. The figure below shows typical department projects inside the portal.


4. Once professors select a project, they can see the standard “available products” in it. This project is used as the basic setup for a Training Lab. The figure below shows a sample screen of the set of tools professors can access by default. They can also add new products to the Lab Catalog.


For every project (lab), shared infrastructure is made available by default in the form of Project Storage, where curriculum-related data and information can be stored and made available to all students. Necessary security aspects such as SSL connections, VPC, and IAM roles are also set up by default to ensure the Cloud Training Lab has a well-architected design.

5. A professor can control the basic parameters of the lab, such as adding/deleting users and managing budgets, and can also take actions like “Pausing” a project (no new products can be created, while existing ones can still be used) or “Stopping” it (all existing running machines are force-stopped and no new ones can be created; however, data on the storage remains accessible to students). The figure below shows how to manage project-level users and budget controls.


6. A professor can track the consumption of lab resources by all users, including professors and students, as shown in the figure below.


7. Once students log into the project and access the lab resources, they can create their own workspaces, such as RStudio, and interact with them from within the Portal. Once done with their work, they can stop the machine and log out to ensure no costs are incurred while the systems are not in use. When a researcher or student logs in, they can interact with active products and project storage as shown in the figure below.


8. The students can interact with their tools like RStudio from within the portal and connect to them in a secure manner with a single click, as shown in the figure below.


9. Clicking the “Open Link” action gives students access to the familiar RStudio environment, where they can log in and learn as per their curriculum needs. The figure below shows the standard RStudio environment.


Summary
The new solution from Relevance Lab makes scientific research and training in the cloud very easy for use cases like Bioinformatics. It provides flexibility, cost management, and secure collaboration to truly unlock the potential of the cloud. For higher education universities, this provides a fully functional Training Lab accessible by professors and students in less than 30 minutes.

If this seems exciting and you would like to know more or try this out, do write to us at marketing@relevancelab.com.

References
University in a Box – Mission with Speed
Leveraging AWS HPC for Accelerating Scientific Research on Cloud
Enabling Frictionless Scientific Research in the Cloud with a 30 Minutes Countdown Now!




2022 Blog, Research Gateway, Blog, Featured

As a researcher, do you want to get started in minutes when running complex genomics pipelines on large data sets, without worrying about the hours needed to set up the environment, the availability and storage of large data sets, the security of your cloud infrastructure, and, most of all, unknown expenses? RLCatalyst makes your life simpler, and in this blog, we will cover how easy it is to use publicly available genomics pipelines from nf-co.re, using Nextflow, in your own AWS Cloud environment.

There are a number of open-source tools available to researchers, which drives re-use. However, research institutions and genomics companies are looking for the right balance across three key dimensions before adopting the cloud at scale for internal use:

  • Cost and Budget Governance: A strong focus on cost tracking of cloud resources to track, analyze, control, and optimize budget spend.
  • Easy Collaboration on Research Data and Tools: Principal Investigators and researchers need to focus on data management, governance, and privacy, along with analysis and collaboration in real time, without worrying about cloud complexity.
  • Security and Compliance: Research requires a strong focus on security and compliance, covering identity management, data privacy, audit trails, encryption, and access management.

To ensure these requirements do not slow researchers down or distract them from science with the complexities of infrastructure, Research Gateway provides a reliable solution by automating cost and budget tracking with safeguards and offering a simple self-service model for collaboration. In this blog, we will demonstrate how researchers can easily use a vast set of publicly available tools, pipelines, and data on this platform with tight budget controls. Here is a quick video of how researchers can get started in a frictionless manner.

nf-co.re is a community effort to collect a curated set of analysis pipelines built using Nextflow. A key aspect of these pipelines is that they adhere to strict guidelines that ensure they can be reused extensively. They have the following advantages (a minimal launch example follows the list below):


  • Cloud-Ready – Pipelines are tested on AWS after every release. You can even browse results live on the website and use outputs for your own benchmarking.
  • Portable and reproducible – Pipelines follow best practices to ensure maximum portability and reproducibility. The large community makes the pipelines exceptionally well tested and easy to run.
  • Packaged software – Pipeline dependencies are automatically downloaded and handled using Docker, Singularity, Conda, or others. No need for any software installations.
  • Stable releases – nf-core pipelines use GitHub releases to tag stable versions of the code and software, making pipeline runs totally reproducible.
  • CI testing – Every time a change is made to the pipeline code, nf-core pipelines use continuous integration testing to ensure that nothing has broken.
  • Documentation – Extensive documentation covering installation, usage, and description of output files ensures that you won’t be left in the dark.
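
To make the list above concrete, this is roughly what launching an nf-core pipeline looks like at the command line. The sketch uses nf-core/rnaseq as an example; the samplesheet path, genome key, and output directory are placeholders, and Research Gateway performs the equivalent steps for you behind its UI.

```bash
# Run the pipeline's bundled test data inside Docker containers
nextflow run nf-core/rnaseq -profile test,docker --outdir ./results

# Run against your own data (samplesheet and genome key are placeholders)
nextflow run nf-core/rnaseq \
    --input samplesheet.csv \
    --genome GRCh38 \
    --outdir ./results \
    -profile docker
```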

Below is a sample of commonly used pipelines supported out-of-the-box in Research Gateway, which can be run with a few clicks to perform important genomic analyses. While publicly available repositories are easily accessible, the platform also allows private repositories and custom pipelines to be run with ease.


  • Sarek – Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling, and annotation) from Whole Genome Sequencing (WGS) or targeted sequencing. Commonly used for: detecting variants on whole-genome or targeted sequencing data.
  • RNA-Seq – RNA-sequencing analysis pipeline using STAR, RSEM, HISAT2, or Salmon, producing gene/isoform counts with extensive quality control. Commonly used for: basic analysis of RNA-sequencing data with a reference genome and annotation.
  • Dual RNA-Seq – Analysis of dual RNA-seq data, an experimental method for interrogating host-pathogen interactions through simultaneous RNA-seq. Commonly used for: specifically analyzing host-pathogen interactions via simultaneous RNA-seq.
  • Bactopia – A flexible pipeline for complete analysis of bacterial genomes. Commonly used for: bacterial genomic analysis with a focus on food safety.
  • Viralrecon – Assembly and intrahost/low-frequency variant calling for viral samples. Commonly used for: metagenomics and amplicon sequencing data derived from the Illumina sequencing platform.

*Each of the above samples can be launched in less than 5 minutes and costs less than $5 to run with test data, delivering productivity gains of about 80%.

The figure below shows the building block of this solution on AWS Cloud.


Steps for Running an nf-core Pipeline with Nextflow on AWS Cloud


1. Log into RLCatalyst Research Gateway with a Principal Investigator or Researcher profile. Select the project for running genomics pipelines and, the first time, create a new Nextflow Advanced product. (Time taken: 5 minutes.)
2. Select the input data location, output data location, and the pipeline to run (from nf-co.re), and provide parameters (container path, data pattern to use, etc.). Default parameters are already suggested for using AWS Batch with Spot instances, and all other AWS complexities are abstracted from the end user for simplicity; a configuration sketch follows this list. (Time taken: 5 minutes to provision a new Nextflow and Nextflow Tower server on AWS, with the AWS Batch setup completed, in 1 click.)
3. Execute the pipeline on the Nextflow server (using the UI or by SSH into the head node). New pipelines can be run, their status monitored, and outputs reviewed from within the Portal UI. (Time taken: pipelines can take some time to run depending on the size of the data and the complexity of the analysis.)
4. Monitor live pipelines with the 1-click launch of Nextflow Tower, integrated with the portal. Also view the pipeline outputs in the output S3 bucket from within the Portal, and use specialized tools like MultiQC, IGV, and RStudio for further analysis. (Time taken: 5 minutes.)
5. All costs related to users, products, and pipelines are automatically tagged and can be viewed on the Budgets screen to track the cloud spend for pipeline execution, including the dynamically provisioned AWS Batch HPC instances. Once the pipelines have finished, the Nextflow server can be stopped or terminated to reduce ongoing costs. (Time taken: 5 minutes.)
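
For the curious, the AWS Batch wiring that Research Gateway automates in step 2 boils down to a few standard Nextflow settings. The sketch below shows a minimal hand-written equivalent; the queue name, S3 bucket, and region are hypothetical placeholders, not values the portal actually uses.

```bash
# Minimal nextflow.config for offloading tasks to AWS Batch (all values are placeholders)
cat > nextflow.config <<'EOF'
process.executor = 'awsbatch'                      // run each task as an AWS Batch job
process.queue    = 'my-batch-queue'                // pre-created Batch job queue
workDir          = 's3://my-bucket/nextflow-work'  // shared work directory on S3
aws.region       = 'us-east-1'
EOF

# With this config in place, a launch like the following executes on AWS Batch
nextflow run nf-core/rnaseq --input samplesheet.csv --outdir s3://my-bucket/results -profile docker
```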

The figure below shows the Nextflow Architecture on AWS.


Summary
The nf-co.re community is constantly striving to make genomics research in the cloud simpler. While these pipelines are easily available, running them on the AWS Cloud with proper cost tracking, collaboration, data management, and an integrated workbench was missing; Research Gateway now solves this. Relevance Lab, in partnership with AWS, has addressed this need with its Genomics Cloud solution to make scientific research frictionless.

To know more about how you can start your Nextflow nf-co.re pipelines on the AWS Cloud in 30 minutes using our solution at https://research.rlcatalyst.com, feel free to contact marketing@relevancelab.com.

References
Enabling Researchers with Next-Generation Sequencing (NGS) Leveraging Nextflow and AWS
Pipelining GATK with WDL and Cromwell on AWS Cloud
Genomics Cloud on AWS with RLCatalyst Research Gateway
Health Informatics and Genomics on AWS with RLCatalyst Research Gateway
Accelerating Genomics and High Performance Computing on AWS with Relevance Lab Research Gateway Solution



