FAIR data is a defining component for a future-proofed pharma or life science enterprise. But what is FAIR data, and how can data be made FAIR? We explain how knowledge graphs achieve FAIR data to accelerate discovery, overcome regulatory hurdles, and supercharge AI you can trust.
AI is transforming the pharma and life sciences industries, and the market for AI in the pharmaceutical industry is projected to reach $13.1 billion by 2034. Within the drug development process alone, the application of AI is expected to shorten drug-to-market timelines from 5-7 years to as little as 12-18 months. The market is already awash with AI tools promising to revolutionize the industry, including OpenFold3, which helps researchers predict the 3D structure of proteins, and Chemistry42, a “generative chemistry” platform.
But FAIR data is fundamental for successful AI. If data isn’t findable, accessible, interoperable, and reusable, then the quality of AI outputs is compromised, and the money and time spent on AI projects are wasted.
Knowledge graphs are conceptual models that visualize relationships between real-world objects and concepts. In this blog post, we pull key insights from the metaphacts video series Mind the Graph: Knowledge graphs in Pharma, featuring Peter Dörr, Director of PreSales at metaphacts (a Digital Science company). In the series, he explores how enterprises in these industries are leveraging knowledge graphs to make their data FAIR and meet complex data needs across the whole value chain, from R&D to clinical trials to manufacturing.
What is FAIR data and how do you achieve it?
“Knowledge graphs are there to make data FAIR.” But what exactly is FAIR data, how do knowledge graphs achieve FAIR data and why should FAIR data matter to industry leaders?
The FAIR principles are guidelines published in the Scientific Data journal, which emphasize machine-actionability. The acronym stands for Findability, Accessibility, Interoperability, and Reusability. These guiding principles are applicable to data architectures across all industries, and are especially beneficial for industries that are data-intensive and subject to strict regulatory and compliance requirements.
Knowledge graphs make data findable because they standardize cataloging and provide a unified metadata layer. As Peter explains, it’s not just the knowledge graphs in and of themselves that achieve FAIR data. When a semantic layer underpins your knowledge graph, it explicitly defines business objects and bridges the semantic gap (where machines or humans interpret the meaning of an object differently and so misunderstand each other). This makes data accessible.
Often built on open standards, like RDF, OWL and SKOS, knowledge graphs help companies to create a machine-readable “digital twin” of their organization’s data landscape – meaning that data is rendered interoperable, and computer systems or software are able to easily exchange and make use of the information.
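To make the idea of machine-readable, standards-based data a little more concrete, here is a minimal sketch in plain Python that writes two statements in the N-Triples syntax for RDF, using the SKOS vocabulary to label a concept. The `http://example.org/` namespace, the `GeneTarget` concept and the helper functions are assumptions made for illustration; a real project would more likely use an RDF library or a knowledge graph platform than hand-built strings.

```python
# Illustrative sketch only: serialise two RDF statements as N-Triples.
# The example.org namespace and the GeneTarget concept are hypothetical.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang=None):
    """Quote a string literal, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


triples = [
    # "GeneTarget is a SKOS concept"
    (iri(EX + "GeneTarget"), iri(RDF_TYPE), iri(SKOS + "Concept")),
    # "GeneTarget has the English preferred label 'Genomic target'"
    (iri(EX + "GeneTarget"), iri(SKOS + "prefLabel"), literal("Genomic target", "en")),
]

# One statement per line, each terminated by " ."
ntriples = "\n".join(f"{s} {p} {o} ." for s, p, o in triples)
print(ntriples)
```

Because every statement follows the same subject, predicate, object shape and uses globally unique identifiers, any system that understands RDF can read the output without a custom import routine.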
Finally, because knowledge graphs aren’t static, they can be easily repurposed and reused. Because knowledge graphs don’t include a fixed database schema, they are more like a living map of your data, which extends and changes as you add new insights. This is unlike traditional databases, where data is stored in rigid tables, and new questions require an upheaval of the existing data structure.
Imagine you build a graph that successfully maps the genomic targets for a rare disease. When you later want to map the genomic targets for a new disease, you can simply layer the new data, such as the clinical trial data for a new drug, into the existing knowledge graph. This potentially saves months of work on data migration and schema redesign.
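The scenario above can be sketched in a few lines of plain Python, under the assumption that a graph is simply a set of subject, predicate, object statements. All identifiers (`disease:RareDiseaseA`, `gene:GENE2`, `trial:TRIAL-001` and so on) and the `match` helper are hypothetical, and a set of tuples is only a stand-in for a real graph database.

```python
# Illustrative sketch only: a set of (subject, predicate, object) tuples
# standing in for a knowledge graph. All identifiers are hypothetical.
graph = {
    ("disease:RareDiseaseA", "hasGenomicTarget", "gene:GENE1"),
    ("disease:RareDiseaseA", "hasGenomicTarget", "gene:GENE2"),
}


def match(triples, s=None, p=None, o=None):
    """Return the statements matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Layer in a new disease and its trial data. The new predicates
# ("investigates", "testsDrug") need no schema change or migration.
graph |= {
    ("disease:DiseaseB", "hasGenomicTarget", "gene:GENE2"),
    ("trial:TRIAL-001", "investigates", "disease:DiseaseB"),
    ("trial:TRIAL-001", "testsDrug", "drug:DrugX"),
}

# The original data is untouched, and old and new data can be queried together:
# which diseases share the target gene:GENE2?
shared = match(graph, p="hasGenomicTarget", o="gene:GENE2")
print([subject for subject, _, _ in shared])
```

In a table-based design, adding trials and drugs would usually mean new tables or columns; here the new statements simply sit alongside the old ones.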
FAIR pharma data is AI-ready pharma data
“Un-FAIR data creates unsuccessful AI projects.” In the pharma and life sciences industry, unsuccessful AI projects lead to dead ends and to time-intensive applications of tools that fail to return meaningful ROI. Unfortunately, such projects are common. But when clearly structured, machine-readable FAIR data is fed to AI, the output is high-quality and traceable.
Tools like ChatGPT and Gemini AI are already being used by many employees in the pharmaceutical industry. However, LLMs are not fully trusted sources of information. Because there is no clear trail of where information was sourced, standalone LLMs provide untraceable answers. This “black box” effect means that even the creators of LLMs are unable to explain how their models arrive at their answers. In an environment where trust is key, using ungrounded AI is a dangerous game.
The consequences of misapplied AI in the pharma industry are especially severe. AI hallucinations can potentially result in life-or-death mistakes. In fact, a report by Swiss Re predicted that in 2032-34, the health and pharma industry will be most at risk of AI misappropriation. Misappropriation of AI might include citing a non-existent clinical study during the drug development process, or relying too heavily on the speed and convenience of AI outputs and shirking necessary due diligence. The latter could mean, for example, fast-tracking a drug to market despite having overlooked its potential for long-term toxicity.
FAIR data is designed to make data machine-readable and AI-actionable. When AI is grounded in knowledge graphs and FAIR data, the results, unlike those of generic LLMs, are transparent and traceable. The detailed metadata provides the explainability needed to ensure pharma and life science companies can trust their AI outputs. The benefit of building knowledge graphs with metaphactory is not only that the results are AI-optimized, but also that metaphactory is a frontrunner in using AI to simplify building and querying the data.
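As a rough illustration of what “traceable” can mean in practice, the sketch below stores each statement together with the source it came from and answers a question only from those stored statements. The `Fact` class, the `grounded_answer` function and every identifier in `FACTS` are hypothetical; this is not how metaphactory or any particular LLM integration works, just the underlying idea in miniature.

```python
# Illustrative sketch only: every statement carries its source, and answers
# are built solely from stored statements. All names here are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    obj: str
    source: str  # identifier of the study or dataset the statement came from


FACTS = [
    Fact("drug:DrugX", "targets", "gene:GENE2", "dataset:internal-assays-v3"),
    Fact("drug:DrugX", "evaluatedIn", "trial:TRIAL-001", "registry:trial-records"),
]


def grounded_answer(subject, predicate, facts):
    """Answer from stored facts only, and always return the sources used."""
    hits = [f for f in facts if f.subject == subject and f.predicate == predicate]
    if not hits:
        # Nothing in the graph supports an answer, so none is invented.
        return {"answer": None, "sources": []}
    return {
        "answer": [f.obj for f in hits],
        "sources": [f.source for f in hits],
    }


print(grounded_answer("drug:DrugX", "targets", FACTS))
print(grounded_answer("drug:DrugX", "hasSideEffect", FACTS))
```

The second call returns no answer rather than a plausible-sounding guess, which is the behavior a reviewer or regulator would want to be able to verify.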
Unlock insights and bridge silos with FAIR data
Although pharmaceutical companies share many of the same data challenges as any large organization, Peter isolates two main pain points in the industry. One is that pharma is a science-driven industry, and science creates a lot of data.
And disconnected data from different labs and publications not only slows down research and development, but data silos undermine efficiency across every operational stage.
For example, Peter illustrates, if you want to repurpose a drug and you fail to connect two relevant research papers, you miss a huge opportunity. Expensive approaches like migrating fragmented data to central repositories such as data warehouses and lakes can end up being time-consuming and inflexible. Point-to-point integration is difficult to scale, and both approaches ultimately fail to provide the holistic point of view necessary to truly mitigate data silos.
Because FAIR data uses standardized ontologies, data from different labs and systems, and even from different geographic locations and languages, can be integrated without great expense or time. Researchers no longer waste time searching for data, miss out on data opportunities, or spend effort evaluating information and resources that are already available within the organization. Having a “digital twin” of the company’s data modeled in a semantic layer means that data is findable and usable, even to those with limited technical expertise.
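A minimal sketch of this kind of harmonization follows, assuming two labs that record the same condition under different local terms, one of them in German. The mapping table, the `concept:MyocardialInfarction` identifier and the sample records are invented for illustration; in practice the mapping would come from a maintained ontology rather than a hand-written dictionary.

```python
# Illustrative sketch only: map local terms from different labs and languages
# to one shared concept identifier. All data below is hypothetical.
TERM_TO_CONCEPT = {
    "heart attack": "concept:MyocardialInfarction",
    "myocardial infarction": "concept:MyocardialInfarction",
    "herzinfarkt": "concept:MyocardialInfarction",  # German term
}

lab_a = [{"sample": "A-1", "condition": "Heart attack"}]
lab_b = [{"sample": "B-7", "condition": "Herzinfarkt"}]


def harmonize(records, mapping):
    """Attach the shared concept ID to each record (None if unmapped)."""
    harmonized = []
    for record in records:
        key = record["condition"].strip().lower()
        harmonized.append({**record, "concept": mapping.get(key)})
    return harmonized


merged = harmonize(lab_a + lab_b, TERM_TO_CONCEPT)

# Group samples by shared concept rather than by each lab's local wording.
by_concept = {}
for record in merged:
    by_concept.setdefault(record["concept"], []).append(record["sample"])

print(by_concept)
```

Once both records point at the same concept, a search for that concept finds samples from both labs, whatever wording or language each lab used.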
One of the successes Boehringer Ingelheim, a German pharmaceutical company, achieved by using metaphactory is that researchers are now able to gain insights and make discoveries much faster than before. This is because their knowledge graphs now provide a holistic and navigational view of their data.
Not only do knowledge graphs help map internal knowledge, but life science and pharma companies also have the option of tapping their internal knowledge into the Dimensions Knowledge Graph.
The Dimensions Knowledge Graph captures 350 million semantically annotated and linked records of global research, and enables integrations with public datasets and ontologies.
Besides the exploration of public datasets, exchanging knowledge with external companies or even competitors can be mutually beneficial. Pre-competitive knowledge sharing in biotech and pharmaceuticals accelerates the discovery of solutions to shared problems. One example of this type of valuable collaboration is ICODA (International COVID-19 Data Alliance), a global initiative in response to the COVID-19 pandemic.
But as Peter explains, this could pose its own regulatory and data privacy concerns. In these scenarios, knowledge graphs enable collaborators to exchange only the required metadata, and make case-by-case access decisions. All of this can be visualized in individualized dashboards, which can be tailored to specific needs by asking natural language questions to generative AI.
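The idea of exchanging only the required metadata and deciding access case by case can be sketched as follows. The predicate names, the `SHAREABLE_PREDICATES` allow-list, the partner identifiers and the `APPROVED_REQUESTS` set are all assumptions made for this example, not features of any specific product.

```python
# Illustrative sketch only: export descriptive metadata to collaborators and
# decide access to the underlying data per request. All names are hypothetical.
SHAREABLE_PREDICATES = {"title", "studyType", "contact"}

internal_graph = [
    ("dataset:DS-1", "title", "Preclinical assay results"),
    ("dataset:DS-1", "studyType", "in vivo"),
    ("dataset:DS-1", "contact", "data-office@example.org"),
    ("dataset:DS-1", "containsPatientLevelData", "true"),
    ("dataset:DS-1", "storageLocation", "internal-cluster-7"),
]

# Case-by-case decisions recorded as (partner, dataset) pairs.
APPROVED_REQUESTS = {("partner:OrgB", "dataset:DS-1")}


def export_metadata(triples, allowed_predicates):
    """Keep only statements whose predicate is on the allow-list."""
    return [t for t in triples if t[1] in allowed_predicates]


def can_access(partner, dataset, approved):
    """Access to the data itself requires an explicit approval."""
    return (partner, dataset) in approved


shared_view = export_metadata(internal_graph, SHAREABLE_PREDICATES)
print(shared_view)
print(can_access("partner:OrgB", "dataset:DS-1", APPROVED_REQUESTS))
print(can_access("partner:OrgC", "dataset:DS-1", APPROVED_REQUESTS))
```

Collaborators can discover that the dataset exists and whom to contact, while sensitive details stay internal until a specific request is approved.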
Stress-free compliance with FAIR data and knowledge graphs
As we just touched on, another concern at the top of the pharmaceutical industry agenda is regulation. To keep patients safe and their medical data secure, regulatory requirements are high.
Companies are beginning to realize the benefit of drawing upon personal data from wearable devices, electronic health records and insurance claims to inform their decisions and monitor efficacy. But more personal data means greater responsibility and pressure to meet regulatory standards.
Although regulatory agencies encourage the use of real-world evidence (RWE), enterprises must ensure that this real-world data satisfies the stakeholders involved in regulatory policies and guidelines, including government agencies, NGOs, and health tech assessment agencies.
In one case, an unexpected, but simple request from a regulator to provide information on a single ingredient sent one pharmaceutical company on a long, labor-intensive journey to satisfy the request. Why did it take so long? Because they had un-FAIR data.
Data managed according to FAIR principles comes with a complete, machine-readable audit trail. This simplifies regulatory processes and ensures that data meets industry standards. To take an example from another company, one success of Boehringer Ingelheim’s knowledge graph architecture is that regulatory tasks are now simpler, because compliance teams can align internal direct product data with their EMA product database.
FAIR data and knowledge graphs provide the structured, flexible and comprehensive solution needed to manage the vast amount of data collected day to day in the pharma and life science sectors. To take another real-life example, one Swiss healthcare company used metaphactory to build a FAIR in vivo data sharing platform. This allowed researchers, bioinformaticians, and lab scientists to browse, search, access, and extract meaningful insights obtained during preclinical studies, while also preparing the data for regulatory submissions.
AI integration, silo mitigation and demanding regulation, simplified with knowledge graphs
The global pharmaceutical market is expected to reach 3.5 billion by 2035. But the companies that will claim the most from this growth won’t be the ones that throw the most money at AI, but the ones with the most reliable, FAIR data.
More and more companies, like Boehringer Ingelheim, that are wise to this reality have built ontologies with the help of tools like metaphactory, and already reap the rewards of this technology. Meanwhile, competitors flounder in un-FAIR, AI-incompatible data landscapes.
In this blog, we’ve explored why FAIR (Findable, Accessible, Interoperable, Reusable) is the linchpin of any forward-looking data strategy. From accelerating discovery and uncovering hidden insights, to creating a machine-readable audit trail of data and closing data silos and bridging semantic gaps, knowledge graphs make data FAIR, and FAIR data is future-proofed.
Learn more about Digital Science’s data solutions for pharma and life science enterprises
Integrating AI, mitigating data silos and satisfying regulatory requirements are just three easy wins of introducing knowledge graph architecture into your pharma or life sciences enterprise.
But this is just the start of what knowledge graph technology can help enterprises achieve. Since 2010, Digital Science has been working with organizations, including life science and pharmaceutical enterprises, to create tailored tools to foster innovation and collaboration.
Digital Science has developed and refined solutions that super-charge the whole research lifecycle, whether safeguarding research programs, enhancing decision making, or showcasing the impact of research.
You can browse the full range of AI-enhanced tools here, or read more about how these tools are already being applied in pharma on our blog.
Watch clips from the Mind the Graph: Knowledge graphs in pharma video series here.
The post Why the future of Pharma data can only be FAIR appeared first on Digital Science.
from Digital Science https://ift.tt/XJqb5p6