Blog post

Five best practices when setting up a clinical data fabric (CDF)

A clinical data fabric (CDF) is a type of data architecture that allows pharma companies, MedTech companies, and research laboratories to integrate, manage, combine, aggregate, visualize, and analyze clinical data from multiple sources in a flexible, scalable, and future-proof manner. A CDF provides a single access point to a comprehensive view of patient and operational health data. These data may include clinical trial data, administrative and medical claims data, clinical laboratory data, electronic health records (EHRs), and data from wearable devices.

But how do you set up a clinical data fabric? In this blog post, we cover five best practices to keep in mind when you start developing a CDF.

1. Define clear objectives

Setting clear objectives helps guide the design, implementation, and use of the CDF, and ultimately ensures that the CDF delivers value to the organization. The objectives can be numerous and diverse, as the following examples show.

Clinical laboratories need data management capabilities to automate data transfers with their clients, which reduces turnaround times for test results and accelerates patient care. A CDF can integrate data from various laboratory instruments and laboratory information management systems (LIMS) to support flexible data exchanges and provide real-time analytics, for example to identify process bottlenecks or to analyze the prevalence of diseases in a given population.

For a pharma sponsor, a CDF can gather real-world evidence (RWE) from sources such as patient communities, laboratories, hospitals, and past studies to optimize protocol design and site selection. Some organizations could also leverage prescription data from reimbursement systems; delivery and shipment data to make sure the drugs are available at the right site, in the right quantity, and at the right time; or manufacturing data about production lines and packaging chains. Finally, clinical data managers can use a CDF to combine and integrate the growing number of patient data sources used during a clinical trial (electronic data capture systems, mobile applications and wearable devices, hospitals and laboratories, etc.).

For research laboratories and MedTech companies, a CDF can gather information about a research topic, which often encompasses a wide variety of data: scientific literature, chemical properties of molecules and their interactions (pathways), omics data, related clinical trial protocols, market and disease prevalence data, patent data, etc.

A scoping study can help organizations define their needs. It often starts by analyzing the specific data objectives and requirements of a data initiative, a concrete project, a small team, or a department. From there, a big picture can be drawn and cross-department synergies uncovered. Finally, each specific objective is decomposed into a set of needs, which are then mapped to CDF components. Starting from clear objectives thus helps ensure that the selected CDF architecture is fit for purpose and delivers the most value.
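
As an illustration, the decomposition of objectives into needs and their mapping to CDF components could be recorded in a simple structure like the one below. This is only a minimal sketch: the objectives, needs, and component names are hypothetical examples, and a real scoping study would produce its own lists.

```python
from collections import defaultdict

# Hypothetical objectives and the needs each one decomposes into.
OBJECTIVE_NEEDS = {
    "reduce lab turnaround time": ["automated data transfer", "real-time analytics"],
    "optimize site selection": ["RWE integration", "real-time analytics"],
}

# Hypothetical mapping from each need to the CDF component that covers it.
NEED_COMPONENTS = {
    "automated data transfer": "data integration",
    "RWE integration": "data integration",
    "real-time analytics": "data analytics",
}


def components_for(objective_needs, need_components):
    """Return {component: sorted list of objectives that depend on it}."""
    usage = defaultdict(set)
    for objective, needs in objective_needs.items():
        for need in needs:
            usage[need_components[need]].add(objective)
    return {component: sorted(objs) for component, objs in usage.items()}


def shared_components(usage):
    """Components serving more than one objective hint at possible synergies."""
    return sorted(c for c, objs in usage.items() if len(objs) > 1)


usage = components_for(OBJECTIVE_NEEDS, NEED_COMPONENTS)
for component, objectives in sorted(usage.items()):
    print(f"{component}: {', '.join(objectives)}")
print("Shared across objectives:", shared_components(usage))
```

Components that serve several objectives are natural candidates to build first, since they are where cross-department synergies are most likely to appear.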

2. Analyze current data infrastructure

It is rare to start from scratch. Most organizations already have a data warehouse in place, managed either centrally or by one or several specific departments. A data catalog may already exist, or some relevant data may already have been integrated into a database system. When initiating a data fabric approach, it is crucial to take stock of these existing data infrastructures to promote the reuse of datasets and infrastructure. The data fabric is not necessarily there to replace existing infrastructures but to complement them and allow them to interoperate. Particular attention should be paid to the data management and governance rules already in place, which must be transferred into the CDF, and possibly complemented, to make sure the organization meets all its regulatory and compliance requirements.

This analysis is also an opportunity to identify the challenges the organization faces when dealing with its existing data. Some important data sources may not yet be integrated, the data transformation steps may be too slow, or some datasets may be impossible to combine to feed the reporting systems. The CDF can then be designed to address these challenges, so that the relevant data is readily available where it is needed.

3. Gather the right team

Setting up a clinical data fabric requires a diverse range of skills, and therefore cross-department collaboration. Data experts (data architects, enterprise architects, IT experts, and data scientists) can assemble the most appropriate and adaptable technology portfolio to meet the organization's needs, and then implement the data fabric. This portfolio may include data integration, data management, data analytics, data visualization, natural language processing, and predictive modelling capabilities. Data experts will also have to work with domain experts (clinical data managers, clinical trial managers, biostatisticians, and bioinformaticians) who have a thorough knowledge of clinical workflows, clinical data standards, and regulatory requirements. Together, they can develop specific dataflows that make clinical data available to other applications and help the organization fully leverage the value of its clinical data.

Moreover, implementing a clinical data fabric can be a complex project that requires careful planning, budgeting, and resource management. A team that has project management skills and can define the right architecture helps ensure that the project stays on track and is completed on time and within budget. Similarly, change management skills help prepare the organization for these changes and keep disruption to existing workflows to a minimum.

4. Adopt an iterative development method

Getting value from data is, in essence, an iterative process. Integrating new datasets enables new use cases, which raise new questions, whose answers in turn require more data to be integrated. This is particularly true for drug development initiatives, since it is often difficult to determine at the beginning of a project which data will be really valuable. Relevant public datasets are often integrated during the early phase; later on, complementary private datasets must be integrated as data exchanges are established with research partners or healthcare providers. Likewise, laboratories have to maintain data exchanges with an evolving pool of clients and partners while adapting to new technologies, for example to analyze patient biopsies. Clinical trials, too, require an ever-growing number of data applications (e.g., ePRO, eCOA) and devices (e.g., connected sensors), along with data management systems, to better collect patient information.

Furthermore, a clinical data fabric often starts small, for instance with a data catalog solution that connects to existing data sources and provides an overview of all the data available. From there, new opportunities can be uncovered, giving rise to new data initiatives that a constantly evolving clinical data fabric will support.
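
To make the "start small" idea concrete, here is a minimal sketch of what a first catalog iteration could track. It does not represent any particular catalog product: the `DatasetEntry` and `DataCatalog` classes, their fields, and the sample entries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetEntry:
    """One dataset known to the catalog (hypothetical fields)."""
    name: str
    source_system: str
    owner: str
    tags: tuple = ()


@dataclass
class DataCatalog:
    """In-memory catalog that indexes datasets living in existing systems."""
    entries: dict = field(default_factory=dict)

    def register(self, entry):
        """Add a dataset, rejecting duplicate names."""
        if entry.name in self.entries:
            raise ValueError(f"dataset already registered: {entry.name}")
        self.entries[entry.name] = entry

    def overview(self):
        """Count registered datasets per source system."""
        counts = {}
        for entry in self.entries.values():
            counts[entry.source_system] = counts.get(entry.source_system, 0) + 1
        return counts

    def search(self, tag):
        """Return the names of datasets carrying the given tag, sorted."""
        return sorted(e.name for e in self.entries.values() if tag in e.tags)


catalog = DataCatalog()
catalog.register(DatasetEntry("trial_visits", "EDC", "clinical data management", ("trial",)))
catalog.register(DatasetEntry("lab_results", "LIMS", "central lab", ("trial", "lab")))
catalog.register(DatasetEntry("instrument_logs", "LIMS", "central lab", ("lab",)))

print(catalog.overview())
print(catalog.search("trial"))
```

Later iterations could add lineage, access rules, or connectors to additional sources without changing how the first users query the catalog.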

Finally, adopting an iterative development process brings flexibility and agility. It allows organizations to adapt to changing business and operational needs. By breaking the development process into smaller, more manageable chunks, organizations can better respond to user feedback and new requirements.

5. Foster a culture of data-driven decision making

The clinical data fabric is there to make data available to everyone in the organization, regardless of their data literacy. It increases collaboration within teams and across departments by promoting the reuse of datasets. It supports and accelerates data preparation for reporting and machine learning tools. But these are not the end of the process. An organization also needs people and operational applications that can leverage the insights generated and turn them into concrete decisions and actions that will boost its growth. Without a proper data-driven culture, the insights generated by a clinical data fabric are useless. It is therefore crucial to promote the use of the CDF within the organization through training sessions and demos, and to keep it evolving by constantly aligning its development with the needs of its users.

By applying these five best practices when developing a CDF architecture, clinical laboratories, pharma companies, and smaller research organizations can efficiently set up a CDF that delivers measurable value, supports business decisions, optimizes clinical trial operations, facilitates research, and eventually improves patient outcomes.

Keyrus can help you shape your data fabric project during a short ideation session that brings different domain experts (clinical trial managers, clinical data managers, legal experts, data architects, etc.) around the same table.

To learn more about the clinical data fabric architecture, download our white paper.