Our client is the leading IT development and consulting company serving multiple hospitals, academic medical centers, medical groups, clinics, and other healthcare institutions in the USA. The firm specializes in compensation, benefits, and human resources solutions for healthcare and not-for-profit organizations. With over 30 years of experience, they offer a comprehensive suite of services designed to help their clients attract, retain, and motivate top talent. Some of their key solutions include executive and physician compensation, employee benefits, workforce planning and analytics, and governance and regulatory compliance.
Our client required our expertise to mine comprehensive details of physicians operating within specific locations and specialties. To this end, the client provides us with a list of hospital websites that serves as the primary source for gathering the relevant information: location of practice, contact details of the medical practitioners, and other details. Additionally, the client has tasked us with identifying gaps in the collected data and conducting extensive web research to enrich the database and keep it comprehensive and up to date.
The requirements and the volume of data to be extracted for the project vary over time. To meet the client's changing demands, we have deployed a team of six dedicated healthcare data professionals and 8-10 part-time resources (subject matter experts) for the project. We began by preparing a database in Excel and later gained access to the client's CRM, in which we perform the following tasks:
The client provides us with relevant sources from which to mine data on US-based physicians. We have allotted a team of dedicated medical data professionals to mine data from the specified sources (particularly hospital websites) within the required location, region, and specialties. For instance, one request was to find family nurse physicians in the state of Indiana.
To scrape the relevant data, our team uses custom scripts and APIs, allowing us to extract various data points like practice location, phone number, gender, NPI numbers, specialties, and more from a variety of websites. For sites that are difficult to scrape automatically, we manually check web sources provided by the client, apply filters based on their specifications, and capture the necessary data. Once the data is collected, we consolidate it into a comprehensive database that our client uses for their own purposes.
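The consolidation step described above can be illustrated with a minimal sketch. It assumes each scraped record is a dictionary and that the NPI number is used as the merge key; the field names and the "first non-empty value wins" rule are hypothetical choices made for illustration, not a description of the scripts actually used on the project.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# A scraped record: field name -> value (None or "" when the site had no value).
Record = Dict[str, Optional[str]]


def consolidate(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Merge records that share an NPI into one entry per physician.

    Assumption: the first non-empty value seen for a field is kept.
    Records without an NPI are returned separately for manual review.
    """
    merged: Dict[str, Record] = {}
    unkeyed: List[Record] = []
    for rec in records:
        npi = rec.get("npi")
        if not npi:
            unkeyed.append(rec)
            continue
        target = merged.setdefault(npi, {})
        for field, value in rec.items():
            # Fill a field only if it is still empty in the merged entry.
            if value and not target.get(field):
                target[field] = value
    return list(merged.values()), unkeyed


if __name__ == "__main__":
    scraped = [
        {"npi": "1234567893", "specialty": "Family Medicine", "phone": None},
        {"npi": "1234567893", "specialty": "", "phone": "317-555-0100"},
        {"npi": None, "specialty": "Cardiology", "phone": "317-555-0101"},
    ]
    physicians, needs_review = consolidate(scraped)
    print(physicians)
    print(needs_review)
```

Keeping the records that lack a merge key in a separate list, rather than discarding them, mirrors the manual follow-up used for sites that cannot be scraped automatically.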
Information collected through automated data scraping is not always accurate, so its reliability cannot be taken for granted. To address this issue, we have implemented a comprehensive data validation and quality review process.
Our professionals carefully verify and validate all information collected by automated scrapers, ensuring that the extracted data is both complete and accurate. This process includes a thorough review of all relevant data points collected during data mining.
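Some of these checks can be automated before a human reviewer looks at a record. The sketch below is an assumption about how such a pre-check might look, not the validation process used on the project. It relies on one well-documented fact: an NPI is a 10-digit number whose last digit is a Luhn check digit computed over the number prefixed with 80840. The function names and the phone-normalization rule are hypothetical.

```python
import re
from typing import Optional


def is_valid_npi(npi: str) -> bool:
    """Return True if npi is 10 digits and passes the Luhn check
    when prefixed with 80840 (the NPI check-digit scheme)."""
    if not re.fullmatch(r"[0-9]{10}", npi):
        return False
    digits = [int(c) for c in "80840" + npi]
    total = 0
    # Walk right to left; double every second digit, starting with the
    # digit immediately left of the check digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def normalize_us_phone(raw: str) -> Optional[str]:
    """Return a bare 10-digit US phone number, or None if it cannot be one."""
    digits = re.sub(r"[^0-9]", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the country code
    return digits if len(digits) == 10 else None


if __name__ == "__main__":
    print(is_valid_npi("1234567893"))
    print(is_valid_npi("1234567890"))
    print(normalize_us_phone("(317) 555-0100"))
```

A check like this only catches malformed values. Whether a well-formed NPI or phone number actually belongs to the physician in question still requires the manual review described above.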
In several instances, the client-provided website does not contain complete information about the physicians. To tackle this, we have assigned a team of web data research experts with extensive experience in the healthcare sector. They sift through various sources, including Google, directory websites, and social media platforms, to identify and gather the relevant data. Our experts manually extract the missing information and enrich the database so that each record is complete and up to date.
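Before manual research can begin, the incomplete records have to be identified. A minimal sketch of such a gap report follows; the list of required fields and the record layout are assumptions chosen for illustration, since the client's actual specification is not given here.

```python
from typing import Dict, List, Optional

# Hypothetical set of fields every physician record should contain.
REQUIRED_FIELDS = ("name", "specialty", "practice_location", "phone", "npi")


def find_gaps(records: List[Dict[str, Optional[str]]]) -> Dict[int, List[str]]:
    """Map the index of each incomplete record to its missing fields.

    A field counts as missing when it is absent, None, or blank.
    """
    gaps: Dict[int, List[str]] = {}
    for idx, rec in enumerate(records):
        missing = [
            field
            for field in REQUIRED_FIELDS
            if not str(rec.get(field) or "").strip()
        ]
        if missing:
            gaps[idx] = missing
    return gaps


if __name__ == "__main__":
    database = [
        {"name": "A. Example", "specialty": "Family Medicine",
         "practice_location": "Indianapolis, IN", "phone": "3175550100",
         "npi": "1234567893"},
        {"name": "B. Example", "specialty": "Cardiology",
         "practice_location": "  ", "phone": None},
    ]
    for idx, fields in find_gaps(database).items():
        print(f"record {idx} is missing: {', '.join(fields)}")
```

The resulting report gives researchers a worklist of exactly which fields to look up for which physician.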
After collecting the desired information and filling in the missing data, our custom list building experts created a comprehensive, up-to-date database of physicians within the client's proprietary platform, covering their specialties, practice locations, phone numbers, and other relevant details. Our team manually verified and compiled data from various sources to ensure that the custom list provided to the client was valid and accurate.
The client signed a long-term contract with SunTec and has been working with us since 2016.
Our approach to data mining, leveraging a combination of automated and manual techniques, has proven to be highly effective, with the client experiencing a remarkable 5X increase in data acquisition speed.
Our diligent data verification and validation procedures have led to a substantial 35% boost in data accuracy for the client, enabling them to make informed decisions.
Our human-in-the-loop approach acknowledges the limitations of automation and underscores the need for human expertise and intervention to ensure optimal results. This project exemplifies that approach: we employed automated data mining and scraping techniques to gather the required data, but relied heavily on human intervention at several stages. Manual data extraction played a vital role in the project's success, particularly where scraping techniques were insufficient. Manual research and enrichment were also necessary to obtain the relevant data on practitioners and physicians. Finally, during data validation our professionals verified the accuracy and relevance of the collected data, giving the client access to accurate, verified, and relevant healthcare data.