Record linkage under suboptimal conditions for data-intensive evaluation of primary care in Rio de Janeiro, Brazil

dc.contributor.author: Coeli, Claudia Medina
dc.contributor.author: Saraceni, Valeria
dc.contributor.author: Medeiros Jr., Paulo Mota
dc.contributor.author: Santos, Helena Pereira da Silva
dc.contributor.author: Guillen, Luis Carlos Torres
dc.contributor.author: Alves, Luís Guilherme Santos Buteri
dc.contributor.author: Hone, Thomas
dc.contributor.author: Millett, Christopher
dc.contributor.author: Trajman, Anete
dc.contributor.author: Durovni, Betina
dc.date.accessioned: 2024-12-11T17:25:50Z
dc.date.available: 2024-12-11T17:25:50Z
dc.date.issued: 2021
dc.description.abstract: Background: Linking Brazilian databases demands the development of algorithms and processes to deal with various challenges, including the large size of the databases, the low number and poor quality of personal identifiers available for comparison (the national security number is not mandatory), and some characteristics of Brazilian names that make the linkage process prone to errors. This study aims to describe and evaluate the quality of the processes used to create an individual-linked database for data-intensive research on the impacts of the expansion of primary care on health indicators in Rio de Janeiro City, Brazil. Methods: We created an individual-level dataset linking social benefits recipients, primary health care, hospital admission and mortality data. The databases were pre-processed, and we adopted a multi-approach strategy combining deterministic and probabilistic record linkage techniques with an extensive clerical review of the potential matches. Relying on manual review as the gold standard, we estimated the false match (false-positive) proportion of each approach (deterministic, probabilistic, clerical review) and the missed match (false-negative) proportion of the clerical review approach. To assess the sensitivity (recall) in identifying social benefits recipients' deaths, we used their vital status registered in the primary care database as the gold standard. Results: In all linkage processes, the deterministic approach identified most of the matches. However, the proportion of matches identified by each approach varied. The false match proportion was around 1% or less in almost all approaches. The missed match proportion was under 3% in the clerical review approach of all linkage processes. We estimated a recall of 93.6% (95% CI 92.8–94.3) for the linkage between social benefits recipients and mortality data.
Conclusion: The adoption of a linkage strategy combining pre-processing routines, deterministic, and probabilistic strategies, as well as an extensive clerical review approach minimized linkage errors in the context of suboptimal data quality.
dc.identifier.citation: Coeli CM, Saraceni V, Medeiros PM Jr, da Silva Santos HP, Guillen LCT, Alves LGSB, Hone T, Millett C, Trajman A, Durovni B. Record linkage under suboptimal conditions for data-intensive evaluation of primary care in Rio de Janeiro, Brazil. BMC Med Inform Decis Mak. 2021 Jun 15;21(1):190. doi: 10.1186/s12911-021-01550-6.
dc.identifier.other: DOI: 10.1186/s12911-021-01550-6
dc.identifier.uri: https://dspace.inc.saude.gov.br/handle/123456789/698
dc.language.iso: en
dc.publisher: BMC Medical Informatics and Decision Making
dc.subject: Medical record linkage (en)
dc.subject: Data accuracy (en)
dc.subject: Brazil (en)
dc.subject: Primary healthcare (en)
dc.title: Record linkage under suboptimal conditions for data-intensive evaluation of primary care in Rio de Janeiro, Brazil
dc.type: Article
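
The abstract describes a strategy that combines pre-processing, a deterministic pass, a probabilistic pass, and clerical review of uncertain pairs. The following is a minimal Python sketch of that general flow, not the authors' implementation. The field names (`name`, `mother`, `dob`), the thresholds (`accept`, `review`), and the use of `difflib.SequenceMatcher` are all assumptions made for illustration. In particular, the averaged string similarity is only a simple stand-in for a true probabilistic model with agreement weights, so the second pass is labeled `approximate` rather than probabilistic.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Uppercase, strip accents, collapse whitespace (a typical pre-processing step)."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.upper().split())


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; an assumption, not the study's measure."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link(record, candidates, accept=0.92, review=0.80):
    """Return (status, candidate); status is 'deterministic', 'approximate',
    'clerical_review', or 'non_match'. Thresholds are hypothetical."""
    key = (normalize(record["name"]), normalize(record["mother"]), record["dob"])

    # Pass 1: deterministic, exact agreement on all normalized identifiers.
    for cand in candidates:
        cand_key = (normalize(cand["name"]), normalize(cand["mother"]), cand["dob"])
        if cand_key == key:
            return "deterministic", cand

    # Pass 2: score only candidates that share the date of birth (blocking).
    best, best_score = None, 0.0
    for cand in candidates:
        if cand["dob"] != record["dob"]:
            continue
        score = (similarity(record["name"], cand["name"])
                 + similarity(record["mother"], cand["mother"])) / 2
        if score > best_score:
            best, best_score = cand, score

    if best_score >= accept:
        return "approximate", best
    if best_score >= review:
        # Uncertain pairs are routed to a human reviewer.
        return "clerical_review", best
    return "non_match", None
```

Pairs returned with status `clerical_review` would be inspected manually; in the study, manual review also served as the gold standard for estimating the false match and missed match proportions.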
Files
Original bundle
Name: Coeli CM et al_BMC Med Inform Decis Mak.pdf
Size: 257.22 KB
Format: Adobe Portable Document Format