Solving the Major Challenges of Data Integration & Analytics for Translational Medicine Applications

Translational medicine is an area of life sciences research that connects early-stage research and development with downstream clinical outcomes, in order to better understand how drugs and therapies affect real-world patients. It has been considered a “holy grail” of the life sciences for decades, because solving it effectively means that work done at the bench (in laboratories) can be more easily connected to the clinical delivery of new medicines, and vice versa. In essence, it more thoroughly connects the upstream and downstream activities that go into making medicines, so that development involves less trial and error and is designed more around real-world data and evidence-based patterns of information.

To achieve these outcomes, researchers need to draw on a wide variety of data from different research and clinical areas. These data sources are normally spread across organizations, exist in incompatible formats, and are often inconsistently labeled. Traditionally, software development in this area meant that large amounts of this heterogeneous data had to be migrated, transformed and mapped into a common repository before complex search and analytics could be performed. The first stage of the process, therefore, normally involves building a pipeline to extract, transform and load (ETL) the data into the new system, a process that can take months, or even years, to complete. Once built, these pipelines are often rigid and brittle, because they connect data in specific ways for specific purposes. When the research questions change, the ETL pipeline often must be rebuilt or extended, over and over. This makes it hard to build a flexible, ad hoc system that can keep up with the scientific demands of translational medicine, where new topics and patterns of interest are constantly emerging as potential research avenues.

At LeapAnalysis (LA), we have built a new type of technology for use in translational medicine, one which allows scientists to run ad hoc queries and analytics directly. By removing the need for ETL pipelines and for data to be physically moved or copied, LeapAnalysis provides a straightforward means to access data directly at the source via intelligent data connectors. These connectors are driven by semantic metadata (data models representing the most basic classes, attributes and relationships in the data) as well as machine learning (which scans, reads and presents the original data schemas directly to LA’s engine). Users can connect directly to data sources, via a cloud-based portal, regardless of where those sources are physically located. LA provides nearly immediate access to a wide variety of sources in many formats, dramatically reducing the time to results compared with search and analytics systems that require expensive data pipelines and complicated mapping strategies.
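
The general idea of querying data in place through metadata-driven connectors can be sketched in a few lines of Python. This is only an illustration of the concept and not LeapAnalysis code: the two sample sources, the field names, the mapping dictionaries and the functions read_lab, read_clinical and federated_query are all hypothetical. Each connector reads its source where it lives and uses a small semantic model to present the rows in a shared vocabulary, so the two sources can be joined at query time without first being loaded into a common repository.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a lab assay export kept as a flat file (CSV).
LAB_CSV = """subj,compound,ic50_nm
P001,CMP-7,12.5
P002,CMP-7,48.0
P003,CMP-9,5.1
"""

# Hypothetical source 2: clinical outcomes kept in a relational database.
clinical_db = sqlite3.connect(":memory:")
clinical_db.execute("CREATE TABLE outcomes (patient_id TEXT, response TEXT)")
clinical_db.executemany(
    "INSERT INTO outcomes VALUES (?, ?)",
    [("P001", "responder"), ("P002", "non-responder"), ("P003", "responder")],
)

# Illustrative semantic metadata: shared attribute name -> source field name.
LAB_MODEL = {"patient": "subj", "compound": "compound", "ic50_nm": "ic50_nm"}
CLINICAL_MODEL = {"patient": "patient_id", "response": "response"}


def read_lab(model):
    """Read the CSV where it lives and yield rows in the shared vocabulary."""
    for row in csv.DictReader(io.StringIO(LAB_CSV)):
        yield {shared: row[local] for shared, local in model.items()}


def read_clinical(model):
    """Query the database in place and yield rows in the shared vocabulary."""
    columns = ", ".join(f"{local} AS {shared}" for shared, local in model.items())
    cursor = clinical_db.execute(f"SELECT {columns} FROM outcomes")
    names = [column[0] for column in cursor.description]
    for values in cursor:
        yield dict(zip(names, values))


def federated_query(compound):
    """Join both sources at query time on the shared 'patient' attribute."""
    responses = {r["patient"]: r["response"] for r in read_clinical(CLINICAL_MODEL)}
    return [
        {
            **row,
            "ic50_nm": float(row["ic50_nm"]),
            "response": responses.get(row["patient"]),
        }
        for row in read_lab(LAB_MODEL)
        if row["compound"] == compound
    ]


results = federated_query("CMP-7")
for record in results:
    print(record)
```

In this sketch, supporting a new research question means writing a new query against the shared vocabulary, and supporting a new source means adding a mapping and a reader; no pipeline has to be rebuilt. A production system would also need to handle authentication, query pushdown, and sources far too large to scan in full.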

LeapAnalysis is changing the way that translational medicine gets done in the following ways:

  • LA dramatically decreases the time needed to find and integrate data: weeks or months become minutes or seconds
  • LA significantly increases data availability, accuracy and overall quality through data enrichment, both semantic and statistical
  • LA improves search and analytics results, because the wide range of data types that complex questions or patterns require (flat files, relational data, text data, image data, graph data, open source data, etc.) can be brought together on the fly, right when the user needs them
  • LA cuts training time on new systems, because users can work not only in LA’s own interfaces but also in their favorite existing BI or analytics dashboards. LA acts as middleware, serving data to these other tools on the back end

LeapAnalysis was recently deployed successfully at a large pharmaceutical customer, within its enterprise IT environment, where it demonstrated its ability to work across several sources and provide search and analytics at high speed, even though the data remained in separate systems and was connected only virtually via metadata.