2021 Blog, Blog, Featured

We aim to enable a next-generation, cloud-based platform for collaborative research on AWS, with frictionless access to research tools, data sets, processing pipelines, and an analytics workbench. It takes less than 30 minutes to launch a “MyResearchCloud” working environment for Principal Investigators and researchers, with security, scalability, and cost governance built in. The Software as a Service (SaaS) model is a preferable option for scientific research in the cloud, with tight control over data security, privacy, and regulatory compliance.

The top five use cases where we have found MyResearchCloud to be a suitable solution for scientific research needs:

  • An RStudio solution on AWS Cloud with the ability to connect securely (using SSL) without having to manage custom certificates and their lifecycle
  • Genomic pipeline processing using the open-source Nextflow and Nextflow Tower solution, integrated with AWS Batch for easy deployment of open-source pipelines and cost tracking per researcher and per pipeline
  • Enabling researchers with EC2 Linux and Windows servers to install their specific research tools and software, with the ability to add AMI-based researcher tools (both private and from AWS Marketplace) with one click on MyResearchCloud
  • Using the SageMaker AI/ML workbench to drive data research (like COVID-19 impact analysis) with public data sets already available on AWS and to create study-specific data sets
  • Enabling a small group of Principal Investigators and researchers to manage research grant programs with tight budget control, self-service provisioning, and research data sharing

MyResearchCloud is a solution powered by the RLCatalyst Research Gateway product and provides the basic environment with access to data, workspaces, an analytics workbench, and cloud pipelines, as explained in the figure below.


Currently, it is not easy for research institutes, their IT staff, and groups of principal investigators and researchers to leverage the cloud for scientific research. Even when these institutions have access to cloud accounts and face constraints with on-premises data centers, converting a basic account into one with a secured network, secured access, the ability to create and publish a product/tools catalog, data ingress and egress, sharing of analysis, and tight budget control is a set of non-trivial tasks that diverts attention away from ‘Science’ to ‘Servers’.

We aim to provide researchers with a standard out-of-the-box catalog, along with the ability to bring your own catalog, as explained in the figure below.


Based on our discussions with research stakeholders, especially small and medium institutions, it was clear that users want something as easy to consume as other consumer-oriented services like e-shopping and consumer banking. This led to the simplified process of creating a “MyResearchCloud” with the following basic needs:


  • “MyResearchCloud” is most suitable for smaller research institutions with one or a few groups of Principal Investigators (PIs) driving research with a few fellow researchers.
  • The model to set up, configure, collaborate, and consume needs to be extremely simple and come with pre-built templates, tools, and utilities.
  • PIs should have full control of their cloud accounts and spending, with dynamic visibility and smart alerts.
  • If the PI decides to stop using the solution at any point, there should be no loss of productivity, and existing compute and data should be preserved.
  • It should be easy to invite other users to collaborate while still controlling their access and security.
  • Users should not be burdened with technical jargon while ordering simple products for day-to-day research: computation servers, data repositories, analysis IDE tools, and data processing pipelines.

Based on these needs, the following simple steps have been enabled:


Steps to launch (cumulative time from start):

Step 1 (1 min): As a Principal Investigator, create your own “MyResearchCloud” by using your email ID or Google ID to log in for the first time on Research Gateway.
Step 2 (4 min): If using a personal email ID, get an activation link and log in for the first time with a secure password.
Step 3 (10 min): Use your own AWS account and provide secure credentials for “MyResearchCloud” consumption.
Step 4 (13 min): Create a new research project and set up your secure environment with default networking, secure connections, and a standard catalog. You can also leverage your existing setup and catalog.
Step 5 (15 min): Invite new researchers, or start using the new setup to order products from a catalog covering data, compute, analytic tools, and workflow pipelines.
Step 6 (30 min): Order the necessary products – EC2, S3, SageMaker/RStudio, Nextflow pipelines. PIs and researchers use the Research Gateway to interact with these tools without needing access to the AWS console.


The picture below shows the easy way to get started with the new Launchpad and the 30-minute countdown.


Architecture Details
To balance the needs of Speed with Compliance, we have designed a unique model to allow Researchers to “Bring your own License” while leveraging the benefits of SaaS in a unique hybrid approach. Our solution provides a “Gateway” model of hub-and-spoke design where we provide and operate the “Hub” while enabling researchers to connect their own AWS Research accounts as a “Spoke”.

Security is a critical part of the SaaS architecture with the hub-and-spoke model. The Research Gateway is hosted in our AWS account using best practices of cloud management and governance controlled by AWS Control Tower, while each tenant is created using AWS security best practices of least-privilege and role-based access, so that no customer-specific keys or data are maintained in the Research Gateway. The architecture and SaaS product are validated under the AWS ISV Path program for Well-Architected principles and data security best practices.
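To illustrate the role-based access model, the sketch below shows how a hub service might obtain short-lived credentials for a spoke account using boto3. The role name and external ID are illustrative assumptions, not the product's actual configuration.

```python
import boto3

def get_spoke_session(spoke_account_id: str, external_id: str) -> boto3.Session:
    """Assume a short-lived, least-privilege role in a researcher's spoke account.

    No long-lived customer keys are stored on the hub side; only temporary
    STS credentials are used. The role name and external ID are assumptions.
    """
    sts = boto3.client("sts")
    resp = sts.assume_role(
        RoleArn=f"arn:aws:iam::{spoke_account_id}:role/ResearchGatewayAccessRole",
        RoleSessionName="research-gateway",
        ExternalId=external_id,   # guards against confused-deputy misuse
        DurationSeconds=900,      # shortest session STS allows
    )
    creds = resp["Credentials"]
    return boto3.Session(
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
```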

The following diagram explains in more detail the hub-and-spoke design for the Research Gateway.


This de-coupled design makes it easy to use a shared gateway while connecting your own AWS account for consumption, with full control and transparency in billing and tracking. For many small and mid-sized research teams, this is the best balance between using a third-party hosted account and having their own end-to-end setup. The structure is also useful for deploying a hosted solution covering multiple group entities (or conglomerates), typically a collaborative network of universities working under a central entity (usually funded by government grants) in large-scale genomics grant programs. For customers with more specific security and regulatory needs, both the hub and the spoke accounts can be self-hosted. The flexible architecture suits different deployment models.


AWS Services that MyResearchCloud uses for each customer:


Service needed for secure research – solution provided – run-time costs for customers:

  • DNS-based friendly URL to access the MyResearchCloud SaaS. Solution: RLCatalyst Research Gateway. Cost: no additional cost.
  • Secure SSL-based connection to my resources. Solution: AWS ACM certificates, with an AWS ALB created for each project tenant. Cost: the ALB is created and deleted based on dependent resources to avoid fixed costs.
  • Network design. Solution: a default VPC is created for new accounts to save users the trouble of network setup. Cost: no additional cost.
  • Security. Solution: role-based access is granted to RLCatalyst Research Gateway, with no keys stored locally. Cost: no additional cost; users can revoke RLCatalyst Research Gateway access at any time.
  • IAM roles. Solution: an AWS Cognito-based model for the hub. Cost: no additional cost beyond the SaaS user-based license.
  • AWS resource consumption. Solution: consumed directly based on user actions; smart features are enabled by default, including a 15-minute auto-stop for idle resources to optimize spending. Cost: actual usage costs, with Spot instances suggested for large workloads.
  • Research data storage. Solution: a default S3 bucket is created per project, with shared project data as well as private study data; storage can be auto-mounted on compute instances with easy access, backup, and sync. Cost: base AWS storage costs.
  • AWS budgets and cost tracking. Solution: each project tracks budget versus actual costs with auto-tagging per researcher, plus notifications and controls to pause or stop consumption when budgets are reached. Cost: no additional cost.
  • Audit trail. Solution: all user actions are tracked in a secure audit trail visible to users. Cost: no additional cost.
  • Create and use a standard catalog of research products. Solution: a standard catalog is provided and uploaded to new projects; users can also bring their own catalogs. Cost: no additional cost.
  • Data ingress and egress for large data sets. Solution: users can sync data to study buckets using standard cloud storage and data transfer features; small sets of files can also be uploaded from the UI. Cost: standard cloud data transfer costs apply.
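As a concrete illustration of the 15-minute auto-stop behavior described above, a minimal sketch of the idea in Python with boto3 follows; the tag key, CPU threshold, and stop criteria are assumptions, not the product's actual implementation.

```python
import boto3
from datetime import datetime, timedelta, timezone

ec2 = boto3.client("ec2")
cw = boto3.client("cloudwatch")

def stop_idle_instances(cpu_threshold: float = 2.0) -> None:
    """Stop gateway-managed instances whose CPU stayed below the threshold
    for the last 15 minutes. The tag key and threshold are assumptions."""
    now = datetime.now(timezone.utc)
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": "tag-key", "Values": ["ResearchGatewayManaged"]},  # assumed tag
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )["Reservations"]
    for r in reservations:
        for inst in r["Instances"]:
            datapoints = cw.get_metric_statistics(
                Namespace="AWS/EC2",
                MetricName="CPUUtilization",
                Dimensions=[{"Name": "InstanceId", "Value": inst["InstanceId"]}],
                StartTime=now - timedelta(minutes=15),
                EndTime=now,
                Period=300,
                Statistics=["Average"],
            )["Datapoints"]
            # Idle only if every 5-minute average in the window is below threshold.
            if datapoints and max(d["Average"] for d in datapoints) < cpu_threshold:
                ec2.stop_instances(InstanceIds=[inst["InstanceId"]])
```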

In our experience, research institutions can enable new groups to use MyResearchCloud with small monthly budgets (starting at US $100 a month) and scale their cloud resources with cost control and optimized spending.

Summary
With the intent to make scientific research in the cloud as easy to access and consume as typical business-to-consumer (B2C) experiences, the new “MyResearchCloud” model from Relevance Lab delivers this ease of use with flexibility, cost management, and secure collaboration to truly unlock the potential of the cloud. It provides a fully functional workbench that takes researchers from a “No-Cloud” to a “Full-Cloud” launch in 30 minutes.

If this seems exciting and you would like to know more or try this out, do write to us at marketing@relevancelab.com.

Reference Links
Driving Frictionless Research on AWS Cloud with Self-Service Portal
Leveraging AWS HPC for Accelerating Scientific Research on Cloud
RLCatalyst Research Gateway Built on AWS
Health Informatics and Genomics on AWS with RLCatalyst Research Gateway
How to speed up the GEOS-Chem Earth Science Research using AWS Cloud?
RLCatalyst Research Gateway Demo
AWS training pathway for researchers and research IT




2021 Blog, AppInsights Blog, ServiceOne, Blog, Featured

Relevance Lab announces the availability of a new product, RLCatalyst AppInsights, on the ServiceNow Store. The certified standalone application is available free of cost and offers a dynamic, application-centric view of AWS resources.

Built on top of AWS Service Catalog AppRegistry and created in consultation with AWS teams, the product offers a unique solution for ServiceNow and AWS customers. It offers dynamic insights related to cost, health, cloud asset usage, compliance, and security, with the ability to take appropriate actions for operational excellence. This helps customers manage their multi-account, dynamic application CMDB (Configuration Management Database).

The product includes ServiceNow dashboards with metrics and actionable insights. The design has pre-built connectors to AWS services and the unique RL DataBridge, which provides integration with third-party applications using a serverless architecture for extended functionality.

Why do you need a Dynamic Application-Centric View for Cloud CMDB?
Cloud-based dynamic assets create great flexibility but add complexity to near real-time asset and CMDB tracking, especially for enterprises operating in a complex multi-account, multi-region, and multi-application environment. Such enterprises with complex cloud infrastructures and ITSM tools struggle to shift the paradigm from infrastructure-centric views to application-centric insights that are better aligned with business metrics, financial tracking, and end-user experience.

While existing solutions using discovery tools and Service Management connectors provide a partial, infrastructure-centric view, a robust application-centric dynamic CMDB was the missing piece that this product now addresses. More details about the features of this product can be found on this blog.

Built on AWS Service Catalog AppRegistry
AWS Service Catalog AppRegistry helps to create a repository of your applications and associated resources. These capabilities enable enterprise stakeholders to obtain the information they require for informed strategic and tactical decisions about cloud resources.

Leveraging AWS Service Catalog AppRegistry as the foundation for the application-centric views, RLCatalyst AppInsights enhances the value proposition and provides integration with ServiceNow.
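For readers unfamiliar with the underlying service, a minimal sketch of registering an application and associating a CloudFormation stack with it through the AppRegistry API (via boto3) follows; the application and stack names are illustrative.

```python
import boto3

appregistry = boto3.client("servicecatalog-appregistry")

# Register an application and associate the CloudFormation stack that
# provisions its resources; the names below are illustrative only.
app = appregistry.create_application(
    name="order-processing",
    description="Order processing service - production",
)
appregistry.associate_resource(
    application=app["application"]["id"],
    resourceType="CFN_STACK",
    resource="order-processing-prod",  # stack name or ARN
)
```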

Value adds provided:

  • Single pane of control for Cloud Operational Management with ServiceNow
  • Cost planning, tracking, and optimization across multi-region and complex cloud setups
  • Near real-time view of the assets, health, security, and compliance
  • Detection of idle capacity and orphaned resources
  • Automated remediation

This enables the entire lifecycle of cloud adoption (Plan, Build and Run) to be managed with significant business benefits of speed, compliance, quality, and cost optimization.

Looking Ahead
With the new product now available on the ServiceNow Store, it is easier for enterprises to download and try it for enhanced functionality on existing AWS and ServiceNow platforms. We expect to work closely with AWS partnership teams to drive adoption of AWS Service Catalog AppRegistry and solutions for TCAM (Total Cost of Application Management) in the market. This will help customers optimize application asset tracking and cloud spend through better planning, monitoring, analysis, and corrective actions, delivered as an intuitive UI-driven ServiceNow application at no additional cost.

To learn more about RLCatalyst AppInsights, feel free to write to marketing@relevancelab.com.




2021 Blog, SPECTRA Blog, Blog, Featured

Oracle Fusion is an invaluable support to many businesses for managing their transaction data. However, business users would be familiar with limitations when it comes to generating even moderately complex analyses and reports involving a large volume of data. In a Big Data-driven world, this can become a major competitive disadvantage. Relevance Lab has designed SPECTRA, a Hadoop-based platform, that makes Oracle Fusion reports simple, quick, and economical even when working with billions of transaction records.

Challenges with Oracle Fusion Reporting
Oracle Fusion users often struggle to extract reports from large transactional databases. Key issues include:


  • Inability to handle large volumes of data to generate accurate reports within reasonable timeframes.
  • Extracting integrated data from different modules of the ERP is not easy. It requires manual effort for synthesizing fragmented reports, which makes the process time-consuming, costly, and error-prone. Similar problems arise when trying to combine data from the ERP with that from other sources.
  • The reports are static, not permitting a drill-down into the underlying drivers of reported information.
  • There are limited self-service operations, and business users have to rely heavily on the IT department for building new reports. It is not uncommon for weeks or months to pass between the first report request and the availability of the report.

Moreover, Oracle ended support for its reporting tool, Discoverer, in 2017, creating additional challenges for users who continue to rely on it.

How RL SPECTRA can Help
Relevance Lab recognizes the value to its clients of generating near real-time dynamic insights from large, ever-growing data volumes at reasonable costs. With that in mind, we have developed an Enterprise Data Lake (EDL) platform, SPECTRA, that automates the process of ingesting and processing huge volumes of data from the Oracle Cloud.

This Hadoop-based solution has advantages over traditional data warehouses and ETL solutions due to its:


  • superior performance through parallel processing capability and robustness when dealing with large volumes of data,
  • rich set of components like Spark, AI/ML libraries to derive insights from big data,
  • a high degree of scalability,
  • cost-effectiveness, and
  • ability to handle semi-structured and unstructured data.

After the initial data ingestion into the EDL, incremental data ingestion uses delta refresh logic to minimize the time and computing resources spent on ingestion.
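As a sketch of what such delta refresh logic can look like in Spark (part of SPECTRA's component stack), consider the following Python snippet; the column name, storage layout, and watermark handling are assumptions for illustration, not SPECTRA's actual implementation.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("spectra-delta-ingest").getOrCreate()

def incremental_ingest(source_path: str, target_path: str, watermark: str) -> str:
    """Ingest only records changed since the last run, identified by an
    assumed last-update timestamp column, and return the new watermark."""
    delta = (
        spark.read.parquet(source_path)
        .where(F.col("last_update_date") > F.lit(watermark))
    )
    delta.write.mode("append").parquet(target_path)
    # Persist the new high-water mark for the next run.
    new_mark = delta.agg(F.max("last_update_date")).first()[0]
    return str(new_mark or watermark)
```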

SPECTRA provides users access to raw data (based on authorization), empowering them to understand and analyze data as per their requirements. It enables users to filter, sort, search, and download up to 6 million records in one go. The platform is also capable of visualizing data in charts and is compatible with standard dashboard tools.

This offering combines our deep Oracle and Hadoop expertise with extensive experience across industries. With this solution, we have helped companies generate critical business reports from massive volumes of underlying data, delivering substantial improvements in extraction and processing time, quality, and cost-effectiveness.


Use Case: Productivity Enhancement through Optimized Reporting for a Publishing Major
A global publishing major that had recently deployed Oracle Fusion Cloud applications for inventory, supply chain, and financial management discovered that these were inadequate to meet its complex reporting and analytical requirements.


  • The application was unable to accurately process the company’s billion-plus transaction records on the Oracle Fusion Cloud to generate a report on the inventory position.
  • It was also challenging to use an external tool to do this as it would take several days to extract data from the Oracle cloud to an external source while facing multiple failures during the process.
  • This made the cost and quality reconciliation of copies of books lying in different warehouses and distribution centres across the world very difficult and time-consuming, as business users did not have timely, accurate visibility of the on-hand quantity.
  • In turn, this had adverse business consequences such as inaccurate planning, higher inventory costs, and inefficient fulfilment.

The company reached out to Relevance Lab for a solution. Our SPECTRA platform automated and optimized the process of data ingestion, harmonization, transformation, and processing, keeping in mind the specific circumstances of the client. The deployment yielded multiple benefits:


  • On-Hand quantity and costing reports are now generated in less than an hour
  • Users can access raw data as well as multiple reports with near real-time data, giving them full flexibility and making the business more responsive to market dynamics
  • Overall, user effort has been reduced by 150 hours per person per quarter by using SPECTRA for the inventory report, leading to higher productivity
  • With all the raw data in SPECTRA, several reconciliation procedures are in place to identify missing data between the Oracle cloud and its legacy system

The Hadoop-based architecture can be scaled flexibly in response to the continuously growing size of the transaction database and is also compatible with the client’s future technology roadmap.


Conclusion
RL’s big-data platform, SPECTRA, offers an effective and efficient future-ready solution to the reporting challenges in Oracle Fusion when dealing with large data sets. SPECTRA enables clients to access near real-time insights from their big data stored on the Oracle Cloud while delivering substantial cost and time savings.

To know more about our solutions or to book a call with us, please write to marketing@relevancelab.com.




2021 Blog, Blog, Featured

Bioinformatics is a field of computational science that involves the analysis of sequences of biological molecules (DNA, RNA, or protein). It’s aimed at comparing genes and other sequences within an organism or between organisms, looking at evolutionary relationships between organisms, and using the patterns that exist across DNA and protein sequences to elucidate their function. Being an interdisciplinary branch of the life sciences, bioinformatics integrates computer science and mathematical methods to reveal the biological significance behind the continuously increasing biological data. It does this by developing methodology and analysis tools to explore the large volumes of biological data, helping to query, extract, store, organize, systematize, annotate, visualize, mine, and interpret complex data.

The advances in cloud computing and the availability of open-source genomic pipeline tools have given researchers powerful means to speed up the processing of next-generation sequencing (NGS). In this blog, we explain how to leverage the RLCatalyst Research Gateway portal to help researchers focus on science, not servers, while dealing with NGS and popular pipelines like RNA-Seq.

Steps and Challenges of RNA-Seq Analysis
Any bioinformatics analysis involving next-generation sequencing with RNA-Seq (an abbreviation of “RNA sequencing”) consists of the following steps:


  • Mapping of millions of short sequencing reads to a reference genome, including the identification of splicing events
  • Quantifying expression levels of genes, transcripts, and exons
  • Differential analysis of gene expression among different biological conditions
  • Biological interpretation of differentially expressed genes

As seen from the figure below, the RNA-Seq analysis for identification of differentially expressed genes can be carried out in one of three protocols (A, B, C), involving different sets of bioinformatics tools. In study A, one might opt for TopHat, STAR, and HISAT for alignment of sequences and HTSeq for quantification, whereas the same steps can be performed using the Kallisto and Salmon tools (study B) or in combination with Cufflinks (study C). All of these yield the same results, which are further used in the identification of differentially expressed genes or transcripts.


Each of these individual steps is executed using a specific bioinformatics tool or set of tools, such as STAR, RSEM, HISAT2, or Salmon for gene isoform counting, plus extensive quality control of the sequenced data. The major bottlenecks in RNA-Seq data analysis are the manual installation of software, deployment platforms, and computational capacity and cost.

The vast number of tools available for a single analysis, along with their different versions and compatibility constraints, makes the setup tricky. It can also be time-consuming, as proper configuration and version compatibility assessment can take months to complete.

Nextflow: Solution to the Bottleneck
The most efficient way to tackle these hurdles is to use Nextflow-based pipelines that support cloud computing, where virtual systems can be provisioned at a fraction of the cost and the setup is smooth enough to be done by a single individual, along with support for container systems (Docker and Singularity).

Nextflow is a reactive workflow framework and a programming DSL (Domain Specific Language) that eases the writing of data-intensive computational pipelines.

As seen in the diagram below, the infrastructure to use Nextflow in the AWS cloud consists of a head node (EC2 instance with Nextflow and Nextflow Tower open source software installed) and wired to an AWS Batch backend to handle the tasks created by Nextflow. AWS Batch creates worker nodes at run-time, which can be either on-demand instances or spot instances (for cost-efficiency). Data is stored in an S3 bucket to which the worker nodes in AWS Batch connect and pull the input data. Interim data and results are also stored in S3 buckets, as is the output. The pipeline to be run (e.g. RNA-Seq, DualRNA-Seq, ViralRecon, etc.) is pulled by the worker nodes as a container image from a public repo like DockerHub or BioContainers.

RLCatalyst Research Gateway takes care of provisioning the infrastructure (EC2 node, AWS Batch compute environment, and Job Queues) in the AWS cloud with all the necessary controls for networking, access, data security, and cost and budget monitoring. Nextflow takes care of creating the job definitions and submitting the tasks to Batch at run-time.

The researcher initiates the creation of the workspace from within the RLCatalyst Research Gateway portal. There is a wide selection of input parameters, including which pipeline to run, tuning parameters to control the sizing and cost-efficiency of the worker nodes, the location of input and output data, etc. Once the infrastructure is provisioned and ready, the researcher can connect to the head node via SSH and launch Nextflow jobs. The researcher can also connect to the Nextflow Tower UI to monitor the progress of jobs.
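Progress can also be checked programmatically against the AWS Batch job queue that backs the pipeline. The following is a minimal sketch using boto3; the queue name is whatever was provisioned for the project and is illustrative here.

```python
import boto3

batch = boto3.client("batch")

def pipeline_progress(job_queue: str) -> dict:
    """Count Nextflow-submitted Batch jobs by state for a quick progress view."""
    counts = {}
    for status in ("SUBMITTED", "PENDING", "RUNNABLE", "STARTING",
                   "RUNNING", "SUCCEEDED", "FAILED"):
        jobs = batch.list_jobs(jobQueue=job_queue, jobStatus=status)
        counts[status] = len(jobs["jobSummaryList"])
    return counts

# Illustrative queue name; use the queue created for your project.
print(pipeline_progress("rnaseq-project-queue"))
```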


The pre-written Nextflow pipelines can be pulled from the nf-core GitHub repository and set up within minutes, allowing the entire analysis to run with a single command, with the results of each step displayed on the command line/shell. Configuration of the resources on the cloud is seamless as well, since Nextflow-based pipelines support batch computing, enabling the analysis to scale as it progresses. Thus, researchers can focus on running the pipeline and analyzing output data instead of investing time in setup and configuration.

As seen from the pipeline output (MultiQC) report of the Nextflow-based RNA-Seq pipeline below, we can identify the sequence quality by looking at FastQC scores, identify duplication scenarios based on the contour plots as well as pinpoint the genotypic biotypes along with fragment length distribution for each sample.


RLCatalyst Research Gateway enables the setting up and provisioning of AWS cloud resources for such analyses with a few simple clicks, and the output of each run is saved in an S3 bucket, enabling easy data sharing. The provisioned resources are pre-configured with a proper design template and security architecture. In addition, RLCatalyst Research Gateway enables cost tracking for running projects, which can be paused, stopped, or deleted as needed.

Steps for Running Nextflow-Based Pipelines in AWS Cloud for Genomic Research
Prerequisites for a researcher before starting data analysis:

  • A valid AWS account and access to the RLCatalyst Research Gateway portal
  • A publicly accessible S3 bucket with large research data sets

Once done, below are the steps to execute this use case.

  • Log in to the RLCatalyst Research Gateway portal and select the project linked to your AWS account
  • Launch the Nextflow-Advanced product
  • Log in to the head node using SSH (Nextflow software will already be installed on this node)
  • In the pipeline folder, modify the nextflow.config file to set the data location according to your needs (GitHub repo, S3 bucket, etc.); this can also be passed via the command line
  • Run the Nextflow job on the head node. This should automatically cause Nextflow to submit jobs to the AWS Batch backend
  • Output data will be copied to the specified output bucket
  • Once done, terminate the EC2 instance and check the cost spent on the use case
  • All costs related to the Nextflow project and researcher consumption are tracked automatically (see the sketch after this list)
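The per-researcher cost view relies on the auto-tagging described earlier. A hedged sketch of how such tagged costs could be totaled with the AWS Cost Explorer API follows; the “Researcher” tag key and values are assumptions about the tag schema, not the product's documented interface.

```python
import boto3

ce = boto3.client("ce")  # AWS Cost Explorer

def researcher_spend(tag_value: str, start: str, end: str) -> float:
    """Sum unblended cost for resources tagged to one researcher.
    Dates use the YYYY-MM-DD format Cost Explorer expects."""
    resp = ce.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        Filter={"Tags": {"Key": "Researcher", "Values": [tag_value]}},  # assumed tag
    )
    return sum(
        float(r["Total"]["UnblendedCost"]["Amount"])
        for r in resp["ResultsByTime"]
    )

print(researcher_spend("jane-doe", "2021-06-01", "2021-07-01"))  # illustrative
```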

Key Points

  • Bioinformatics involves developing methodology and analysis tools to analyze large volumes of biological data
  • The vast number of tools available for a single analysis, and their version compatibility, make the analysis setup tricky
  • RLCatalyst Research Gateway enables the setting up and provisioning of Nextflow-based pipelines and AWS cloud resources with a few simple clicks

Summary
Researchers need powerful tools for collaboration and access to commonly used NGS pipelines with large data sets. Cloud computing makes this much easier with access to workflows, data, computation, and storage. However, there is a learning curve in cloud-specific know-how and in using resources optimally for large-scale computations like RNA-Seq analysis pipelines, which can also be quite costly. Relevance Lab, working closely with AWS, provides the RLCatalyst Research Gateway portal with commonly used pre-built Nextflow pipeline templates and integration with open-source repositories like nf-core and BioContainers. RLCatalyst Research Gateway enables the execution of such scalable Nextflow-based pipelines on the cloud with a few clicks and configurations, with cost tracking and resource execution controls. By using AWS Batch, the solution is highly scalable and optimized for on-demand consumption.

For more information, please feel free to write to marketing@relevancelab.com.





2021 Blog, AWS Governance, Blog, Featured, thank you

As enterprises continue to rapidly adopt the AWS cloud, the complexity and scale of operations on AWS have increased exponentially. Enterprises now operate hundreds and even thousands of AWS accounts to meet their enterprise IT needs. With this in mind, AWS Management & Governance has emerged as a major focus area that enterprises need to address in a holistic manner to ensure efficient, automated, performant, available, secure, and compliant cloud operations.


Governance360 integrated with Dash ComplyOps

Governance360
Relevance Lab has recently launched its Governance360 professional services offering in the AWS Marketplace. This offering builds upon Relevance Lab’s theme of helping customers adopt AWS the right way.

Governance360 brings together the framework, tooling, and process for implementing best-practices-based AWS Management & Governance at scale for multi-account AWS environments. It helps clients seamlessly manage their “Day after Cloud” operations on an ongoing basis. The tooling leveraged for implementing Governance360 can include native AWS tools and services, RL's tools, and third-party industry tools.

Typically, a professional service like Governance360 is engaged during or after a customer's transition to the AWS cloud (infrastructure and application migration, or development on AWS).

Dash ComplyOps Platform
The Dash ComplyOps platform enables and automates the lifecycle of a client's journey toward compliance of their AWS environments with industry-specific requirements such as HIPAA, HITRUST, SOC 2, and GDPR. It provides organizations with the ability to manage a robust cloud security program through the implementation of guardrails and controls, continuous compliance monitoring, and advanced reporting and remediation of security and compliance issues.


Relevance Lab and Dash Solutions have partnered together to bring an end-to-end solution and professional service offering that helps customers realize an automated AWS Management & Governance posture for their environments meeting regulatory compliance needs.
As a part of this partnership, the Dash ComplyOps platform is integrated within the overall Governance360 framework. The detailed mapping of features, tooling, and benefits (including Dash ComplyOps as a tool) across Governance360’s major topic areas is articulated in the table below.

Benefits


Governance360 Topics and Benefits

Automation Lifecycle
  • Automate repetitive and time-consuming tasks
  • Automated setup of environments for common use cases such as regulatory, workloads, etc.
  • Codify best practices learned over time

Control Services
  • Automated & standardized account provisioning
  • Cost & budget management
  • Architecture for industry-standard compliance, monitoring, and remediation
  • Disaster recovery
  • Automated & continuous compliance monitoring, detection, and remediation

Proactive Monitoring
  • Dashboards for monitoring the AWS environment from infrastructure to application

Security Management
  • Ease of deployment of security controls at scale using CI/CD pipelines
  • Infra and application security threat monitoring, prevention, detection & remediation

Service & Asset Management
  • Software and asset management practice with a real-time CMDB for applications & infrastructure
  • Incident management and auto-remediation

Workload Migration & Management
  • Best-practices-based workload migration and implementations on AWS cloud

Regulatory Compliance
  • Compliance with industry regulatory standards


Engagement Flow

Discovery & Assessment (typical duration: 1-3 weeks*)
  • Understand current state, data, and management & governance goals

Analysis & Recommendation (typical duration: 1-3 weeks*)
  • Requirement analysis, apply the Governance360 framework, create recommendations, and obtain client sign-off
  • Recommendations include services, tools, and dashboards, plus expected outcomes and benefits
  • Use of native AWS services, RL's monitoring & BOTs, the Dash ComplyOps platform, and other third-party tools

Implementation (typical duration: 2-8 weeks*)
  • Implement, test, UAT, and production cutover of recommended services, tools, and dashboards

Hypercare (typical duration: 1-2 weeks*)
  • Post-implementation support – monitor and resolve any issues faced

* Duration depends on the complexity and scope of the requirements.

Summary
Relevance Lab is an AWS consulting partner and helps organizations achieve automation-led cloud management using Governance360, based on AWS best practices. For enterprises with regulatory compliance needs, integration with the Dash ComplyOps platform provides an advanced setup for operating in a multi-account environment. While enterprises can try to build some of these solutions themselves, doing so is time-consuming and error-prone and demands a specialist partner. Relevance Lab has helped multiple enterprises with this need and has a reusable automated solution and pre-built library to meet the security and compliance needs of any organization.

For more details, please feel free to contact marketing@relevancelab.com.

References
Governance 360 – Are you using your AWS Cloud “The Right Way”
AWS Security Governance for enterprises “The Right Way”
Dash ComplyOps by Dash Solutions
Governance360 available on AWS Marketplace




    2021 Blog, SWB Blog, Blog, Featured

    Provide researchers access to secure RStudio instances in the AWS cloud by using Amazon issued certificates in AWS Certificate Manager (ACM) and an Application Load Balancer (ALB)

    Cloud computing offers the research community access to vast amounts of computational power, storage, specialized data tools, and public data sets, collectively referred to as Research IT, with the added benefit of paying only for what is used. However, researchers may not be experts in using the AWS Console to provision these services in the right way. This is where software solutions like Service Workbench on AWS (SWB) make it possible to deliver scientific research computing resources in a secure and easily accessible manner.

RStudio is popular software within the scientific research community and is supported by Service Workbench; researchers use it commonly in their day-to-day efforts. However, installing RStudio securely on the AWS cloud and using it in a cost-effective manner is a non-trivial task, especially for researchers. With SWB, the goal is to make this process very simple, secure, and cost-effective so that researchers can focus on “Science” and not “Servers”, thereby increasing their productivity.

    Relevance Lab (RL), in partnership with AWS, set out to make the experience of using RStudio with Service Workbench on AWS simple and secure.

    Technical Solution Goals

    1. A researcher should be able to launch an RStudio instance in the AWS cloud from within the Service Workbench portal.
    2. The RStudio instance comes fully loaded with the latest version of RStudio and a variety of other software packages that help in scientific research computing.
3. The user launches RStudio through a unique URL generated by SWB from within the Service Workbench portal. This URL is encoded with an authentication token that lets the researcher access the RStudio instance without remembering any passwords, and it is served over SSL so that all communications are encrypted in transit (a simplified sketch of this signed-URL idea follows this list).
    4. Maintaining the certificates used for SSL communication should be cost-effective and should not require excessive administrative efforts.
    5. The solution should provide isolation of researcher-specific instances using allowed IP lists controlled by the end-user.
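To make the token-encoded URL in goal 3 concrete, here is a minimal, hypothetical sketch of a signed, time-limited connection URL in Python. SWB's actual token scheme may differ; the secret handling and parameter names are illustrative assumptions.

```python
import hashlib
import hmac
import time

SECRET = b"per-deployment-signing-secret"  # illustrative; SWB manages its own secrets

def make_connection_url(base_url: str, user_id: str, ttl_seconds: int = 300) -> str:
    """Build a short-lived, tamper-evident URL for one user's RStudio instance."""
    expires = int(time.time()) + ttl_seconds
    payload = f"{user_id}:{expires}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"{base_url}/auth?user={user_id}&expires={expires}&sig={sig}"

def verify_connection(user_id: str, expires: str, sig: str) -> bool:
    """Reject expired or tampered tokens using a constant-time comparison."""
    expected = hmac.new(
        SECRET, f"{user_id}:{expires}".encode(), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, sig) and time.time() < int(expires)
```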

    Comparison of Old and New Design Principles to make Researcher Experience Frictionless
The following section summarizes the old design and the new architecture that makes the entire researcher experience frictionless. Based on feedback from researchers, the older design required a lot of setup complexity and lifecycle upgrades for security certificate management, slowing down researchers' productivity. The new solution makes the lifecycle simple and frictionless, with smart features to keep ongoing costs optimized.


No. – RStudio Feature – Original Design Approach – New Design Approach

1. User-generated security certificate for SSL secure connections to RStudio.
Original: Users have to create a certificate (e.g., Let's Encrypt) and use it with the RStudio EC2 instance and an NGINX server. This makes the certificate lifecycle complex for end-users to create, maintain, and renew, and the RStudio AMI also has to manage the certificate lifecycle.
New: Move from external certificates to AWS ACM. Bring in a shared AWS ALB (Application Load Balancer) and use AWS ACM certificates for each hosting account to simplify the certificate management lifecycle.

2. SSL secure connection.
Original: Create an SSL connection with the NGINX server on the RStudio EC2, tied to custom certificate management.
New: Replaced with an ALB at the account level, shared by all RStudio instances in the account. The user-portal-to-ALB connection is secured by ACM; the ALB-to-RStudio-EC2 connection is encrypted with a unique self-signed certificate per RStudio.

3. Client role (IAM) changes in SWB.
Original: The client role is granted the necessary permissions for setup purposes.
New: Additional role privileges added to handle the ALB.

4. ALB design.
Original: Not present.
New: A shared ALB per hosting account, shared between projects. Each ALB is expected to cost about $20-50 monthly in shared mode with average use. An API model is used to create/delete the ALB.

5. Route 53 changes on the main account.
Original: A CNAME record is created with the EC2 DNS name.
New: A CNAME record is created with the ALB DNS name.

6. RStudio AMI.
Original: Embedded with certificate details, tied to custom certificate management.
New: Independent of user-provided certificate details. The AMI has also been enhanced: a self-signed SSL certificate and additional packages (as commonly requested by researchers) are baked in.

7. RStudio CloudFormation Template (CFT).
Original: The original template is to be removed from SWB.
New: Added a new output to indicate the “Need ALB” flag, plus a new target group to which the ALB can route requests.

8. SWB hosting account configuration.
Original: Did not have to provision a certificate in AWS ACM.
New: Manual process to set up a certificate in a new hosting account.

9. Tracking the active count of provisioned RStudio instances per hosting account.
Original: None.
New: Needed to ensure the ALB is created when the first RStudio is provisioned and deleted after the last RStudio is deleted, to optimize the ALB's cost overheads.

10. SWB DynamoDB table changes.
Original: DynamoDB is used for all tables in the SWB design.
New: Modifications to support the new design, added to the existing DeploymentStore table.

11. SWB provision-environment workflow.
Original: Standard design.
New: An additional step checks whether the “Workspace Type” needs an ALB and, if it does, either creates one or passes a reference to the existing one.

12. SWB terminate-environment workflow.
Original: Standard design.
New: An additional step checks whether the last active RStudio is being deleted and, if so, also deletes the ALB to reduce idle costs.

13. Secure “Connection” action from the SWB portal to the RStudio instance.
Original: To ensure each RStudio has a secure connection per user, a unique connection URL is generated during the user session, valid for a limited period.
New: The original design is preserved. Internally, the routing is managed through the ALB, but the concept remains the same: users do not have to remember an ID/password for RStudio, and a secure connection is always available.

14. Secure “Connection” from the SWB portal disallowing other users from accessing RStudio resources, given the shared ALB.
Original: Not applicable.
New: The design feature in item 13 ensures that, even with the shared ALB, a user's (Researcher's or PI's) connection is restricted to their provisioned RStudio only; they cannot access other researchers' instances. The unique connection is system-generated using a unique user-to-RStudio mapping.

15. ALB routing rules for RStudio secure connections, given the shared ALB.
Original: Not applicable.
New: Every time an RStudio is created, ALB rules are changed to allow a secure connection between the user session and the linked RStudio, and the same rules are cleaned up when the RStudio is deleted. These ALB routing-rule changes are managed from SWB code under the workflow customizations (items 11 and 12) using APIs.

16. RStudio configuration parameters related to CIDR.
Original: Only allow-listed IP addresses may connect to associated RStudio instances; this can also be modified from configuration.
New: The RStudio CloudFormation Template (CFT) takes Classless Inter-Domain Routing (CIDR) as an input parameter and passes it through as an output parameter; SWB takes the CIDR from the CFT output and updates the ALB listener rule with the respective target group.

17. Researcher cost tracking.
Original: RStudio costs are tracked per researcher; custom certificate costs, if any, were not tracked.
New: RStudio costs are tagged and tracked per researcher; ALB costs are treated as shared costs for the hosting account.

18. RStudio packaging and delivery for a new customer – repository model.
Original: Bundled with the standard SWB repo and installed.
New: RL creates a separate repo and hosts RStudio with associated documentation and templates for customers to use.

19. RStudio packaging and delivery for a new customer – AWS Marketplace model.
Original: None.
New: RL to provide RStudio on AWS Marketplace for SWB customers to add to the standard Service Catalog and import (future roadmap item).

20. Upgrade and support models for RStudio.
Original: Owned by SWB teams.
New: Managed by RL teams.

21. UI modification for partner-provided products.
Original: No partner-provided products.
New: Partner-provided products reside in the self-hosted repo; the SWB UI provides a mechanism to show partner names and a link to additional information.

    The diagram below explains the interplay between different design components.


    Secure and Scalable Solution Architecture
Keeping in mind the above design goals, a secure and scalable architecture was implemented that lets shared groups use products like RStudio over secure HTTPS without the overhead of individual certificate management. The architecture also extends the same approach to all future researcher products with similar needs, without additional implementation overhead, resulting in increased productivity and lower costs.


    The Relevance Lab team designed a solution centered on an EC2 Linux instance with RStudio and relevant packages pre-installed and delivered as an AMI.

    1. When the instance is provisioned, it is brought up without a public IP address.
    2. All traffic to this instance is delivered via an Application Load Balancer (ALB). The ALB is shared across multiple RStudio instances within the same account to spread the cost over a larger number of users.
    3. The ALB serves over an SSL link secured with an Amazon-issued certificate which is maintained by AWS Certificate Manager.
    4. The ALB costs are further brought down by provisioning it on demand when the first RStudio instance is provisioned. Conversely, the ALB is de-provisioned when the last RStudio instance is de-provisioned.
    5. Traffic between the ALB and the RStudio instance is also secured with an SSL certificate which is self-signed but unique to each instance.
6. The ALB listener rules enforce the IP allow-list configured by the user, as sketched below.
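The snippet below illustrates what such a listener rule could look like when created with boto3: traffic is forwarded to a workspace's target group only when the host header matches and the source IP falls inside the user's allowed CIDR. The ARNs, hostname, and priority are placeholders supplied by the provisioning workflow, not the product's actual values.

```python
import boto3

elbv2 = boto3.client("elbv2")

def add_rstudio_rule(listener_arn: str, target_group_arn: str,
                     host_header: str, allowed_cidr: str, priority: int) -> None:
    """Route one workspace's hostname to its target group, restricted to the
    user's allowed CIDR. All inputs here are illustrative placeholders."""
    elbv2.create_rule(
        ListenerArn=listener_arn,
        Priority=priority,
        Conditions=[
            {"Field": "host-header",
             "HostHeaderConfig": {"Values": [host_header]}},
            {"Field": "source-ip",
             "SourceIpConfig": {"Values": [allowed_cidr]}},  # user's allow-list
        ],
        Actions=[{"Type": "forward", "TargetGroupArn": target_group_arn}],
    )
```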

    Conclusion
Both the SWB and Relevance Lab RLCatalyst Research Gateway teams are committed to making scientific research frictionless for researchers. With a shared goal, this new initiative speeds up collaboration and will help provide new, innovative open-source solutions leveraging Service Workbench on AWS and partner-provided solutions like this RStudio with ALB from Relevance Lab. The collaboration will soon add more solutions covering genomic pipeline orchestration with Nextflow, use of HPC Parallel Cluster, and secure research workspaces with AppStream 2.0, so stay tuned.

    To get started with RStudio on SWB provided by Relevance Lab use the following link:
    Relevance Lab Github Repository for SWB Templates

    For more information, feel free to contact marketing@relevancelab.com.

    References
    Service Workbench on AWS for driving Scientific Research
    Service Workbench on AWS Documentation
    Service Workbench on AWS Github Repository
    RStudio Secure Architecture Patterns
    Relevance Lab Research Gateway




    2021 Blog, AppInsights Blog, Blog, Featured

    Many AWS customers either integrate ServiceNow into their existing AWS services or set up both ServiceNow and AWS services for simultaneous use. Customers need a near real-time view of their infrastructure and applications spread across their distributed accounts.

    Commonly referred to as the “Dynamic Application Configuration Management Database (CMDB) or Dynamic Assets” view, it allows customers to gain integrated visibility into their infrastructures to break down silos and facilitate better decision making. From an end-user perspective as well, there is a need for an “Application Centric” view rather than an “Infrastructure/Assets” view as better visibility ultimately enhances their experience.

    An “Application Centric” View provides the following insights.

    • Application master for the enterprise
    • Application linked infrastructure currently deployed and in use
    • Cost allocation at application levels (useful for chargebacks)
    • Current health, issues, and vulnerability with application context for better management
• Better alignment with the existing enterprise context of business units, projects, and cost codes for budget planning and tracking

    Use Case benefits for ServiceNow customers
A near real-time view of AWS application and infrastructure workloads across multiple AWS accounts in ServiceNow enables self-service for a customer's Managed Service Provider (MSP) and their developers to:

    • Maintain established ITSM policies & processes
    • Enforce Consistency
    • Ensure Compliance
    • Ensure Security
    • Eliminate IAM access to underlying services

    Use Case benefits for AWS customers
    Enabling application self-service for general & technical Users. The customer would like service owners (e.g. HR, Finance, Security & Facilities) to view AWS infrastructure-enabled applications via self-service while ensuring:

    • Compliance
    • Security
    • Reduce application onboarding time
    • Optical consistency across all businesses

    RLCatalyst AppInsights Solution – Built on AppRegistry
    Working closely with AWS partnership groups in addressing the key needs of customers, RLCatalyst AppInsights Solution provides a “Dynamic CMDB” solution that is Application Centric with the following highlights:

    • Built on “AWS AppRegistry” and tightly integrated with AWS products
    • Combines information from the following Data Sources:
      • AWS AppRegistry
      • AWS Accounts
        • Design time Data (Definitions – Resources, Templates, Costs, Health, etc.)
        • Run time Data (Dynamic Information – Resources, Templates, Costs, Health, etc.)
      • AppInsights Additional Functionality
        • Service Registry Insights
        • Aggregated Data (Lake) with Dynamic CMDB/Asset View
        • UI Interaction Engine with appropriate backend logic


    A well-defined Dynamic Application CMDB is mandatory in cloud infrastructure to track assets effectively and serves as the basis for effective Governance360.

    To learn more about RLCatalyst AppInsights Solution Build on AWS AppRegistry click here.

AWS recently released a new feature called AppRegistry to help customers natively build an AWS resource inventory with insights into usage across applications. AWS Service Catalog AppRegistry allows creating a repository of your applications and associated resources. Customers can define and manage their application metadata, which allows them to understand the context of their applications and resources across their environments. These capabilities enable enterprise stakeholders to obtain the information they require for informed strategic and tactical decisions about cloud resources. Using AppRegistry as a base product, we have created a Dynamic Application CMDB solution, AppInsights, to benefit AWS and ServiceNow customers, as explained in the figure below.



    Modeling a common customer use case
Most customers have multiple applications deployed in different regions, constituting sub-applications, underlying web services, and related infrastructure, as explained in the figure below. The dynamic nature of cloud assets and automated provisioning with Infrastructure as Code make discovery, and keeping the CMDB up to date, a non-trivial problem.



As explained above, a typical customer setup consists of different business units deploying applications in different market regions across complex and hybrid infrastructure. Most existing CMDB applications provide a static asset view that is incomplete and poorly aligned with the growing need for real-time application-centric analysis, cost allocation, and application health insights. The AppInsights solution addresses this problem by leveraging customers' existing investments in ServiceNow ITSM licenses and pre-existing AWS solutions like the AWS Service Management Connector, which are available at no additional cost. The missing piece, until recently, was application-centric metadata linking applications to infrastructure templates.

Customers need to see information across their AWS accounts with details of applications, infrastructure, and costs in a simple and elegant manner, as shown below. The basic KPIs tracked in the dashboard are the following:

    • Dashboard per AWS Account provided (later aggregated information across accounts to be also added)
    • Ability to track an Application View with Active Application Instances, AWS Active Resources and Associated Costs
    • Trend Charts for Application, Infrastructure and Cost Details
• Drill-down ability to view all applications and associated active instances, which are updated dynamically using periodic sync or on demand

A dynamic application CMDB is made possible by leveraging the AWS Well-Architected best practice of “Infrastructure as Code”, relying on AWS Service Catalog, the AWS Service Management Connector, AWS CloudFormation templates, AWS Costs & Budgets, and AWS AppRegistry. The application is built as a scoped application inside ServiceNow and leverages standard ITSM licenses, making it easy for customers to adopt and share this solution with business users without the need for AWS Console access.



Workflow steps for the adoption of RLCatalyst AppInsights are explained below. The solution is based on standard AWS and ServiceNow products commonly used in enterprises and builds on existing best practices, processes, and collaboration models.


Step 1: Define AppRegistry data – AppRegistry
Step 2: Link applications to infrastructure templates (CloudFormation Template (CFT) / Service Catalog (SC)) – AWS accounts, asset definitions
Step 3: Ensure all provisioned assets have application and service tagging, enforced with guard rails – AWS accounts, asset runtime data
Step 4: Register application services – Service Registry
Step 5: Refresh the AppInsights data lake with static and dynamic updates (aggregated across accounts) – RLCatalyst AppInsights
Step 6: Asset, cost, and health dashboard – RLCatalyst AppInsights
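Step 3's tagging guard rail can be audited programmatically. A hedged sketch using the AWS Resource Groups Tagging API follows; the “AppName” tag key is an assumed convention for illustration, not a documented requirement of the product.

```python
import boto3

tagging = boto3.client("resourcegroupstaggingapi")

def resources_missing_app_tag(required_key: str = "AppName") -> list:
    """Return ARNs of resources lacking the application tag so guard rails
    can flag or remediate them. The 'AppName' key is an assumed convention."""
    missing = []
    paginator = tagging.get_paginator("get_resources")
    for page in paginator.paginate():
        for res in page["ResourceTagMappingList"]:
            keys = {t["Key"] for t in res.get("Tags", [])}
            if required_key not in keys:
                missing.append(res["ResourceARN"])
    return missing

print(resources_missing_app_tag())
```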


A typical implementation of RLCatalyst AppInsights can be rolled out for a new customer in 4-6 weeks and can provide significant business benefits for multiple groups, enabling better operations support, self-service requests, application-specific diagnostics, asset usage tracking, and cost management. The base solution is built on a flexible architecture that allows advanced customization to extend to real-time health and vulnerability mappings and achieve AIOps maturity. In the future, there are plans to extend the application-centric views to cover more granular “services” tracking to support microservice architectures, container-based deployments, and integration with other PaaS/SaaS-based services.

    Summary
Cloud-based dynamic assets create great flexibility but add complexity to near real-time asset and CMDB tracking. While existing solutions using discovery tools and Service Management connectors provide a partial, infrastructure-centric view of the CMDB, a robust application-centric dynamic CMDB was the missing solution that is now addressed with RLCatalyst AppInsights, built on AppRegistry, as explained above.

    For more information, feel free to contact marketing@relevancelab.com

    References
    Governance360 – Are you using your AWS Cloud “The Right Way”
    ServiceNow CMDB
    Increase application visibility and governance using AWS Service Catalog AppRegistry
    AWS Security Governance for Enterprises “The Right Way”
    Configuration Management in Cloud Environments




    2021 Blog, BOTs Blog, Blog, Featured

In large enterprises with complex systems, covering new-generation cloud-based platforms while continuing with stable legacy back-end infrastructure usually results in high-friction points at the integration layers. These incompatible systems can also slow down enterprise automation efforts to free up humans and have BOTs take over repetitive tasks. Now, with the RLCatalyst BOTs Server and the common platforms of ServiceNow and UiPath, we provide an intelligent and scalable solution that also covers legacy infrastructure like AS/400 with terminal interfaces. Such applications are commonly found in the supply chain, logistics, and warehousing domains, supporting temporary/flexi-staff onboarding and offboarding driven by transaction volumes in industries that see demand spikes around special events, which powers the need for automation-first solutions.

Integrating a cloud-based ticketing system with a terminal-based system has traditionally required a support engineer, especially in labor-intensive industries. This is true for any legacy system that does not provide an external API for integration. Diverse issues then occur that slow down the business, and any automation of such workflows must not compromise security and governance.

Lacking a stable API to interface with the AS/400 legacy system, we decided to rely on BOTs simulating the same behavior as humans dealing with terminal interfaces. RLCatalyst BOTs has been used extensively as an IT automation platform for ServiceNow, and the same concept was extended to interact with terminal interfaces commonly used in Robotic Process Automation (RPA) use cases with UiPath. RLCatalyst acts as an “Automation Service Bus” and manages the integration between the ServiceNow ITSM platform and the UiPath terminal-interface engine. The solution is extendable and can solve other common problems, especially integration between IT and business systems.
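As an idea of what the bus's hand-off can look like, the sketch below forwards a ServiceNow ticket's fields to a UiPath workflow through Orchestrator's StartJobs endpoint. The endpoint is UiPath's public OData API; how ticket fields map to workflow input arguments is an illustrative assumption, not RLCatalyst's actual implementation.

```python
import json
import requests

def start_uipath_job(orchestrator_url: str, token: str, release_key: str,
                     ticket_fields: dict) -> int:
    """Forward a ServiceNow catalog task to UiPath Orchestrator as a job.

    Uses UiPath's StartJobs OData endpoint; the mapping of ticket fields
    to workflow input arguments is assumed for illustration.
    """
    resp = requests.post(
        f"{orchestrator_url}/odata/Jobs/"
        "UiPath.Server.Configuration.OData.StartJobs",
        headers={"Authorization": f"Bearer {token}"},
        json={
            "startInfo": {
                "ReleaseKey": release_key,                    # identifies the workflow
                "Strategy": "JobsCount",
                "JobsCount": 1,
                "InputArguments": json.dumps(ticket_fields),  # passed to the BOT
            }
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["value"][0]["Id"]  # job id, for later status tracking
```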



    Using UiPath to automate processes in legacy systems
    Leveraging the capabilities of UiPath to automate terminal-based legacy systems, RLCatalyst interfaces with the service portal to gather all the information required for UiPath's UiRobot to execute the steps defined in the workflow. RLCatalyst's BOT framework provides the necessary tools to run and schedule BOTs with governance and audit-trail functionality.

    Case Study – Onboarding an AS400 system user
    The legacy AS400 system's user onboarding process was multi-staged, with each stage representing a server with its own ACL tables. A common profile name linked the servers, and in some cases independent logins were required. A process definition document was the only governing document helping a CS executive complete the onboarding process.

    The design used to automate the process was:

    • Build individual workflows for each stage in the User interaction processes using UiPath.
    • Build an RLCatalyst BOT which:
      • Refers to a template that defines the stages to run based on the type of user being onboarded (a simplified sketch of this template-driven loop follows the list).
      • Based on the template, maintains a record of the profile names allotted.
      • Validates profile availability (this also accounts for onboarding done outside the automation).
      • Executes the UiPath workflow for each stage, in the sequence defined in the template.
      • Once the execution is complete, sends a summary of the execution, with user login details, back to the ITSM system.
      • Maintains logs for each stage for analysis and error correction.
    • Build the Service Portal approval workflow, which would finally create a task for the automation process for fulfillment.
      • The service portal form captures all the necessary information for onboarding a new user.
      • Based on the account template selected, which depicts a work department, a template reference is captured and included in the submitted form.
      • The service portal is used by the SOX compliance team to trace approval and provisioning details.
      • The process trail becomes critical during off-boarding, to confirm that access revocation has occurred without delay.
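    A minimal sketch of the template-driven loop described above; the user types, stage names, and helper function are hypothetical, not the actual BOT definitions:

        # Hypothetical sketch of the template-driven onboarding BOT: each user
        # type maps to an ordered list of UiPath workflow stages.
        ONBOARDING_TEMPLATES = {
            "warehouse_operator": ["create_profile", "grant_wms_access", "grant_billing_access"],
            "cs_executive": ["create_profile", "grant_crm_access"],
        }

        def run_uipath_stage(stage, profile_name):
            # Placeholder: in practice this would start the UiPath workflow for
            # the stage (see the earlier Orchestrator sketch) and poll the job.
            return "success"

        def onboard_user(user_type, profile_name):
            # Execute each stage in the order defined by the template; stop at
            # the first failure so it can be analyzed from the stage logs.
            results = []
            for stage in ONBOARDING_TEMPLATES[user_type]:
                outcome = run_uipath_stage(stage, profile_name)
                results.append((stage, outcome))
                if outcome != "success":
                    break
            return results  # a summary is posted back to the ITSM system

        print(onboard_user("cs_executive", "CSEX042"))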

    Advantages of Using UI Automation Over Standard API

    • Some of the AS400 servers used for Invoicing/Billing are over 20 years old, and the processes are as old as the servers themselves. The challenge multiplies when the application code is only understood by a small set of IT professionals.
    • UI automation eliminates system-testing costs, since it simply mimics a user; all user flows have already been tested.
    • The time taken to build the end-to-end automation is significantly less than engaging an in-demand IT professional to build an API interface for the same purpose.
    • The total automation investment is also significantly reduced, and ROI is achieved faster.

    Getting Started
    Based on our previous experience integrating and automating processes, our pre-built libraries and BOTs provide a head start to your automation needs. The framework ensures that all necessary security and compliance needs are met.

    For more details, please feel free to reach out to marketing@relevancelab.com.




    2021 Blog, Blog, Featured

    AWS Marketplace is a high-potential channel for delivering software and professional services. The main benefit for customers is a single AWS bill covering all their infrastructure and software consumption. And since AWS is already on the approved vendor list of many enterprises, it becomes easier to procure software from the same vendor as well.

    Relevance Lab has always considered AWS Marketplace one of the important channels for the distribution of its software products. In 2020, we listed our RLCatalyst 4.3.2 BOTs Server product on the AWS Marketplace as an AMI-based product that customers can download and run in their own AWS account. This year, RLCatalyst Research Gateway was listed on the AWS Marketplace as a Software as a Service (SaaS) product.

    This blog details some of the steps that a customer needs to go through to consume this product from the AWS Marketplace.


    Step 1: The first step for a customer looking to find the product is to log in to their account and visit the AWS Marketplace, then search for RLCatalyst Research Gateway. The Research Gateway product will appear at the top of the results list. Click on the link to reach the product details page.

    The product details page lists important details such as:

    • Pricing information
    • Support information
    • Set up instructions

    Step 2: The user then subscribes to the product by clicking on the "Continue to Subscribe" button. This step requires the user to log in to their AWS account (if not already logged in). The page that comes up shows the contract options the user can choose from. RLCatalyst Research Gateway (SaaS) offers three subscription tiers:

    • Small tier (1-10 users)
    • Medium tier (11-25 users)
    • Large tier (unlimited users)

    Also, the customer has the option of choosing a monthly contract or an annual contract. The monthly contract is good for customers who want to try the product, or for those who prefer a budget outflow spread over the year rather than a lump sum. The annual contract is good for customers who are already committed to using the product in the long term, and it gets the customer an additional discount over the monthly price.

    The customer also has to choose whether the contract should renew automatically or not.

    One of the great features of AWS Marketplace is that the customer can modify the contract at any time and upgrade to a higher plan (e.g. Small tier to Medium or Large tier). The customer can also modify the contract to opt for auto-renewal at any time.

    Step 3: After choosing the contract options, the user clicks on the "Subscribe" button. This leads the user to the registration page, where they can set up their RLCatalyst Research Gateway account.



    This screen is meant for the Administrator persona to enter the details of the organization. Once the user enters the details, agrees to the End User License Agreement (EULA), and clicks on the Sign-up button, provisioning of the account is set in motion. The user should get an acknowledgment email within 12 hours and a verification email within 24 hours.

    Step 4: The user should verify their email address by clicking on the verification link in the email they receive from RLCatalyst Research Gateway.

    Step 5: Finally, the user will get a "Welcome" email with the details of their account, including the custom URL for logging in to their RLCatalyst Research Gateway account. The user is now ready to log in to the portal, where they will see a Welcome screen.


    Step 6: The user can now set up their first Organizational Unit (OU) in the RLCatalyst Research Gateway portal by following these steps.

    6.1 Navigate to settings from the menu at the top right.


    6.2 Click on the “Add New” button to add an AWS account.


    6.3 Enter the details of the AWS account.


    Note that the account name given on this screen can be any name that helps the Administrator remember which OU and project the account is meant for.

    6.4 The Administrator can repeat this procedure to add more than one project (consumption) account. (An illustrative sketch of one common account-linking pattern follows.)
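    This blog does not detail how the portal connects to a project account. As one common pattern for this kind of SaaS onboarding, the customer account can expose a cross-account IAM role for the portal to assume; the sketch below is purely illustrative, and the role name, external ID, and trusted account ID are hypothetical.

        # Illustration only: create a cross-account IAM role that a SaaS portal
        # could assume to provision resources in this (project) account.
        import json
        import boto3

        TRUSTED_SAAS_ACCOUNT = "111111111111"   # hypothetical portal-side account
        trust_policy = {
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{TRUSTED_SAAS_ACCOUNT}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": "my-external-id"}},
            }],
        }

        iam = boto3.client("iam")
        role = iam.create_role(
            RoleName="ResearchGatewayAccess",   # hypothetical role name
            AssumeRolePolicyDocument=json.dumps(trust_policy),
            Description="Allows the portal to provision research assets",
        )
        print(role["Role"]["Arn"])   # the ARN would be supplied to the portal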

    Step 7: Next, the Administrator needs to add Principal Investigator users to the account. For this, they should contact the support team either by email (rlc.support@relevancelab.com) or by visiting the support portal (https://serviceone.relevancelab.com).

    Step 8: The final step to set up an OU is to click on the "Add New" button on the Organizations page.


    8.1 The Administrator should give a friendly name to the Organization in the "Organization Name" field, then choose all the accounts that will be consumed by projects in this Organization. A friendly description should be entered in the "Organization Description" field. Finally, choose the Principal Investigator who will manage/own this Organizational Unit and click "Add Organization" to add the OU.


    Summary
    As you can see above, ordering RLCatalyst Research Gateway (SaaS) from the AWS Marketplace makes it extremely easy to get started, and end-users can begin using the product in no time. Given the SaaS model, the customer does not need to worry about setting up the software in their own account. At the same time, using their AWS account for the projects gives them complete transparency into budget consumption.
    In our next blog, we will provide step-by-step details of adding organizational units, projects, and users to complete the setup.

    To learn more about AWS Marketplace installation, click here.

    If you want to learn more about the product or book a live demo, feel free to contact marketing@relevancelab.com.




    2021 Blog, Blog, Featured

    Working on non-scientific tasks such as setting up instances, installing software libraries, getting models to compile, and preparing input data is one of the biggest pain points for atmospheric scientists, or any scientist for that matter. These tasks are challenging because they require strong technical skills, deviating scientists from their core areas of analysis and research data compilation. Adding to this, some of these tasks require high-performance computation, complicated software, and large data sets. Lastly, researchers need a real-time view of their actual spending, as research projects are often budget-bound. Relevance Lab helps researchers "focus on science and not servers", in partnership with AWS, leveraging the RLCatalyst Research Gateway (RG) product.

    Why RLCatalyst Research Gateway?
    Speeding up scientific research using the AWS cloud is a growing trend towards achieving "Research as a Service". However, the adoption of AWS Cloud can be challenging for researchers, with surprises on costs, security, governance, and the right architectures. Similarly, Principal Investigators can have a challenging time managing a research program with collaboration, tracking, and control. Research institutions would like to provide consistent and secure environments, standard approved products, and proper governance controls. The product was created to solve these common needs of Researchers, Principal Investigators, and Research Institutions.


    • Available on AWS Marketplace and can be consumed in both SaaS and Enterprise modes
    • Provides a Self-Service Cloud Portal with the ability to manage the provisioning lifecycle of common research assets
    • Gives real-time visibility of spend against the defined project budgets
    • Allows the Principal Investigator to pause or stop the project if the budget is exceeded, until a new grant is approved

    In this blog, we explain how the product has been used to solve a common research problem with GEOS-Chem, used for Earth Sciences. It covers a simple process that starts with access to large data sets on public S3 buckets, creation of an on-demand compute instance with the application loaded, copying the latest data for analysis, running the analysis, storing the output data, analyzing it using specialized AI/ML tools, and then deleting the instances. This is a scenario researchers face daily, and the product demonstrates a simple, frictionless Self-Service capability to achieve it with tight controls on cost and compliance.

    GEOS-Chem enables simulations of atmospheric composition on local to global scales. It can be used offline as a 3-D chemical transport model driven by assimilated meteorological observations from the Goddard Earth Observing System (GEOS) of the NASA Global Modeling and Assimilation Office (GMAO). The figure below shows the basic construct of GEOS-Chem input and output analysis.



    Being a common use case, there is documentation available in the public domain by researchers on how to run GEOS-Chem on AWS Cloud. The product makes the process simpler using a Self-Service Cloud portal. To know more about similar use cases and advanced computing options, refer to AWS HPC for Scientific Research.



    Steps for GEOS-Chem Research Workflow on AWS Cloud
    Prerequisites for a researcher before starting data analysis:

    • A valid AWS account and access to the RG portal
    • A publicly accessible S3 bucket with large research data sets
    • An additional EBS volume for ongoing operational research work (for occasional usage, it is recommended to store a snapshot in S3 for better cost management); see the sketch after this list
    • A pre-provisioned SageMaker Jupyter notebook to analyze output data
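    A minimal boto3 sketch of the EBS prerequisite above; the region, volume size, device name, and instance ID are examples only:

        # Create an EBS volume for ongoing research work and attach it to the
        # researcher's EC2 instance (IDs and sizes are placeholders).
        import boto3

        ec2 = boto3.client("ec2", region_name="us-east-1")
        vol = ec2.create_volume(AvailabilityZone="us-east-1a", Size=200, VolumeType="gp3")
        ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])
        ec2.attach_volume(
            VolumeId=vol["VolumeId"],
            InstanceId="i-0123456789abcdef0",   # hypothetical instance ID
            Device="/dev/sdf",
        )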

    Once done, below are the steps to execute this use case.

    • Log in to the RG portal and select the GEOS-Chem project
    • Launch an EC2 instance with the GEOS-Chem AMI
    • Log in to the EC2 instance using SSH and configure the AWS CLI
    • Connect to the public S3 bucket from the AWS CLI to list the NASA-NEX data (these data steps can also be scripted; see the sketch after this list)
    • Run the simulation and copy the output data to a local S3 bucket
    • Link the local S3 bucket to an AWS SageMaker instance and launch a Jupyter notebook to analyze the output data
    • Once done, terminate the EC2 instance and check the cost spent on the use case
    • All costs related to the GEOS-Chem project and researcher consumption are tracked automatically
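    The minimal boto3 sketch below lists objects in the public NASA-NEX bucket anonymously and copies a simulation output file to a project bucket; the bucket names, key, and local file are examples, not outputs from an actual run.

        # List the public NASA-NEX data set without credentials, then upload a
        # GEOS-Chem output file to the project's own S3 bucket.
        import boto3
        from botocore import UNSIGNED
        from botocore.config import Config

        # Anonymous client for the public data set
        public_s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
        listing = public_s3.list_objects_v2(Bucket="nasanex", MaxKeys=10)
        for obj in listing.get("Contents", []):
            print(obj["Key"])

        # Credentialed client for the project's local bucket
        s3 = boto3.client("s3")
        s3.upload_file(
            "output/GEOSChem.SpeciesConc.nc",        # hypothetical local output file
            "my-geoschem-project-bucket",            # hypothetical project bucket
            "runs/2021-07/GEOSChem.SpeciesConc.nc",
        )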

    Sample Output Analysis
    Once you process the output files in the Jupyter notebook, it compiles them and presents the output data in a visual format, as shown in the sample below. The researcher can then create a snapshot, upload it to S3, and terminate the EC2 instance (without deleting the additional EBS volume created along with it); a sketch of this cleanup step follows.
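    A minimal boto3 sketch of this cleanup step; the volume and instance IDs are placeholders:

        # Snapshot the work volume, then terminate the instance. A volume
        # attached with DeleteOnTermination=False survives the termination.
        import boto3

        ec2 = boto3.client("ec2", region_name="us-east-1")
        snap = ec2.create_snapshot(
            VolumeId="vol-0123456789abcdef0",
            Description="GEOS-Chem work volume before shutdown",
        )
        ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])
        ec2.terminate_instances(InstanceIds=["i-0123456789abcdef0"])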

    Output analyzing the loss rate and air mass of Hydroxide, pertaining to Atmospheric Science.


    Summary
    Scientific computing can take advantage of cloud computing to speed up research, scale up computing needs almost instantaneously, and do all this with much better cost-efficiency. Researchers no longer need to worry about the expertise required to set up the infrastructure in AWS, as they can leave this to tools like RLCatalyst Research Gateway, thus compressing the time it takes to complete their research computing tasks.

    The steps demonstrated in this blog can easily be replicated for other, similar research domains. They can also be used to onboard new researchers with pre-built solution stacks provided in an easy-to-consume option. RLCatalyst Research Gateway is available in SaaS mode from the AWS Marketplace, and research institutions can continue to use their existing AWS accounts to configure and enable the solution for more effective scientific research governance.

    To learn more about GEOS-Chem use cases, click here.

    If you want to learn more about the product or book a live demo, feel free to contact marketing@relevancelab.com.

    References
    Enabling Immediate Access to Earth Science Models through Cloud Computing: Application to the GEOS-Chem Model
    Enabling High‐Performance Cloud Computing for Earth Science Modeling on Over a Thousand Cores: Application to the GEOS‐Chem Atmospheric Chemistry Model



