Algorithm development to estimate pregnancy data from Electronic Health Records

A Use Case from Argentina

Carolina Mengoni Goñalons

cmengoni@buenosaires.gob.ar

Buenos Aires City Government | University of Buenos Aires

Juliana Reves Szemere

jrevesszemere@buenosaires.gob.ar

Buenos Aires City Government | National University of San Martín | National Pedagogical University

María Cristina Nanton

m.nanton@buenosaires.gob.ar

Buenos Aires City Government | University of Buenos Aires

Information and Health Statistics Management Office

Ministry of Health of the City of Buenos Aires

Map of South America, zooming in on the City of Buenos Aires.

Why does our office need to detect and estimate pregnancies and delivery dates?

Diagram showing the information flow from health centers to the information system to electronic health records.

Free-text input by healthcare professionals
Rich, high-value data
Longitudinal pregnancy monitoring
Critical obstetric metrics

Background

Use of structured data to estimate pregnancy outcomes:

Capture of the journal article titled 'Development and evaluation of MADDIE: Method to Acquire Delivery Date Information from Electronic Health Records,' published in the International Journal of Medical Informatics, Volume 145, January 2021. Authors listed are Silvia P. Canelón, Heather H. Burris, Lisa D. Levine, and Mary Regina Boland.

Aim: Develop an algorithm that infers patient delivery dates (PDDs) and delivery-specific details from Electronic Health Records (EHRs) with high accuracy, enabling pregnancy-level outcome studies in women’s health.

Tools

R-based, open-source ecosystem for scalable health data processing

agiseR is an internal R package developed by our team to facilitate access to local databases and automate reporting workflows.

Automated Detection of Pregnancies

Records and pregnancies

Diagram illustrating patient timelines. Each horizontal line represents an individual patient, with black-outlined circles representing health records and red segments indicating pregnancy intervals.

Automated Detection of Pregnancies

Extraction of Gestational Age (GA) values

Diagram showing a patient's timeline. Black-outlined circles represent electronic health records from which a Gestational Age value could be extracted, while grey circles indicate that no GA was obtained.

This patient timeline shows health records that may or may not contain gestational age (GA) information (black circles and grey circles, respectively).
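As an illustration of this extraction step, here is a minimal base R sketch. The note texts, the function name `extract_ga_weeks`, the patterns (the abbreviation "EG" or a number followed by "sem"/"semanas") and the 44-week plausibility bound are all assumptions made for the example; the source does not show the real record wording or the rules actually used.

```r
# Hypothetical notes: the source does not show real record text.
notes <- c(
  "Control prenatal. EG 12 sem.",
  "Consulta por tos, sin fiebre.",
  "Embarazo de 30 semanas."
)

extract_ga_weeks <- function(text) {
  # Assumed patterns: "EG <number>" or "<number> sem...".
  pattern <- "\\bEG\\s*:?\\s*\\d{1,2}|\\b\\d{1,2}\\s*sem"
  m <- regexpr(pattern, text, ignore.case = TRUE, perl = TRUE)
  ga <- rep(NA_integer_, length(text))
  # regmatches() returns only the matched strings, so fill matched positions.
  ga[m != -1] <- as.integer(gsub("[^0-9]", "", regmatches(text, m)))
  # Assumed plausibility bound: discard values above 44 weeks.
  ga[!is.na(ga) & ga > 44L] <- NA_integer_
  ga
}

ga_weeks <- extract_ga_weeks(notes)
```

Records without a match keep `NA`, which corresponds to the grey circles in the diagram. A pattern this loose would also match durations unrelated to pregnancy (for example a symptom lasting some weeks), so a real implementation would need stricter context rules.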

Automated Detection of Pregnancies

Clustering of records

Diagram showing a patient's timeline. Grey-outlined circles represent non-informative records. Coloured circles represent records referring to different pregnancies, distinguished by colour.

This patient's health records belong to different pregnancies (differently coloured circles). Our algorithm identifies them by comparing differences in GA values and the time gap between records.
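One way to picture the comparison of GA differences against time gaps is sketched below in base R. It assumes one patient's GA-bearing records sorted by date, and it starts a new pregnancy whenever the elapsed time and the change in GA disagree by more than a tolerance. The function name `assign_pregnancy`, the example data and the 21-day tolerance are assumptions, not the rule the team actually applies.

```r
# Hypothetical records for one patient, already sorted by date.
records <- data.frame(
  record_date = as.Date(c("2022-01-10", "2022-03-07", "2022-05-02",
                          "2023-06-05", "2023-08-14")),
  ga_weeks = c(8, 16, 24, 10, 20)
)

assign_pregnancy <- function(record_date, ga_weeks, tolerance_days = 21) {
  # Days elapsed between consecutive records.
  elapsed_days <- as.numeric(diff(record_date), units = "days")
  # Change in gestational age between consecutive records, in days.
  ga_change_days <- diff(ga_weeks) * 7
  # Within one pregnancy both quantities should roughly agree.
  new_pregnancy <- abs(elapsed_days - ga_change_days) > tolerance_days
  # The first record always opens pregnancy 1; each break opens the next.
  cumsum(c(TRUE, new_pregnancy))
}

records$pregnancy_id <- assign_pregnancy(records$record_date, records$ga_weeks)
```

In this example the first three records advance by eight weeks of GA every eight calendar weeks, so they share one identifier. The fourth record shows a lower GA more than a year later, so it opens a second pregnancy.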

Automated Detection of Pregnancies

Delimiting of detected pregnancy

Diagram showing a patient's timeline. Coloured segments represent the pregnancies reconstructed by the algorithm from the patient's records. Below, bold text indicates variables that characterise the pregnancy, while regular text indicates operations performed.

As a result of the algorithm, two pregnancies have been reconstructed from the patient's records (differently coloured segments). Below, bold text indicates variables that characterise the pregnancy, while regular text indicates operations performed.
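The delimiting step could look roughly like the base R sketch below. It assumes that each record implies a pregnancy start (record date minus GA), that the implied starts are averaged per pregnancy, and that the estimated due date follows the common obstetric convention of 280 days after that start. The column names and the helper `summarise_pregnancy` are hypothetical; the variables and operations shown on the slide are not reproduced in this text.

```r
# Hypothetical clustered records for one patient.
records <- data.frame(
  pregnancy_id = c(1, 1, 1, 2, 2),
  record_date = as.Date(c("2022-01-10", "2022-03-07", "2022-05-02",
                          "2023-06-05", "2023-08-14")),
  ga_weeks = c(8, 16, 24, 10, 20)
)

summarise_pregnancy <- function(df) {
  # Each record implies a start date: record date minus GA in days.
  implied_start <- as.numeric(df$record_date) - df$ga_weeks * 7
  # Assumption: average the implied starts of all records in the cluster.
  start <- as.Date(round(mean(implied_start)), origin = "1970-01-01")
  data.frame(
    pregnancy_id = df$pregnancy_id[1],
    first_record = min(df$record_date),
    last_record = max(df$record_date),
    estimated_start = start,
    # Assumption: due date 280 days (40 weeks) after the estimated start.
    estimated_due_date = start + 280
  )
}

pregnancies <- do.call(
  rbind,
  lapply(split(records, records$pregnancy_id), summarise_pregnancy)
)
```

Each row of `pregnancies` then describes one reconstructed pregnancy, which corresponds to one coloured segment in the diagram.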

Automated Detection of Pregnancies

Updating

Flowchart illustrating how a new medical record is classified in relation to pregnancy detection. It begins with a new record and asks whether it matches someone with a detected pregnancy. If not, it's marked as a new pregnancy. If yes, it checks if the pregnancy is already known. If not, it's also a new pregnancy. If yes, and based on comparing changes in gestational age versus record date, the record is classified as new information about an ongoing pregnancy, indicating either end of pregnancy or higher gestational age.
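The decisions in the flowchart can be sketched as a small base R function. This is a simplified reading: it only distinguishes a new pregnancy from an update to an ongoing one, and it treats a record as belonging to the known pregnancy when the change in GA matches the elapsed time within a tolerance. The function name `classify_record`, its arguments, the returned labels and the 21-day tolerance are assumptions made for the example.

```r
classify_record <- function(new_date, new_ga_weeks,
                            last_date = NULL, last_ga_weeks = NULL,
                            tolerance_days = 21) {
  # No pregnancy detected so far for this patient: start a new one.
  if (is.null(last_date)) {
    return("new pregnancy")
  }
  # Compare the change in GA with the time elapsed since the last record.
  elapsed_days <- as.numeric(new_date - last_date, units = "days")
  ga_change_days <- (new_ga_weeks - last_ga_weeks) * 7
  if (abs(elapsed_days - ga_change_days) > tolerance_days) {
    # The record does not fit the known pregnancy.
    return("new pregnancy")
  }
  "update to ongoing pregnancy"
}

# Patient without any detected pregnancy.
first_case <- classify_record(as.Date("2023-06-05"), 10)

# GA advanced 10 weeks over 10 calendar weeks: same pregnancy.
second_case <- classify_record(as.Date("2023-08-14"), 20,
                               last_date = as.Date("2023-06-05"),
                               last_ga_weeks = 10)

# Lower GA more than a year later: a different pregnancy.
third_case <- classify_record(as.Date("2023-06-05"), 10,
                              last_date = as.Date("2022-05-02"),
                              last_ga_weeks = 24)
```

Keeping the decision in one pure function makes it easy to apply record by record as new data arrive, without reprocessing the patient's full history.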

Project’s impact

Available pregnancy patient information and efficient processing for secondary uses of health data.

Primary healthcare

Alert rules for specific conditions
Result indicators for appropriate checkups
Managing conditions alongside a pregnancy

Healthcare cost recovery

Generation of tracer indicators

Mandatory reportable diseases

Differentiate pregnant patients for Epidemiological Bulletin

Thank you for your time!