Istanbul's municipal digital archive is sitting on a problem it can no longer ignore. Across multiple public databases in the city, including those maintained by the Istanbul Metropolitan Municipality's cultural directorate and the Istanbul Archaeological Museums' digitisation unit, tens of thousands of duplicate image files have accumulated over years of overlapping scanning projects, departmental handoffs, and emergency digitisation drives. Those drives followed the February 2023 Kahramanmaraş earthquakes, which reminded the entire country how quickly physical collections can be lost.
The duplication crisis matters now for a specific reason: the Istanbul Metropolitan Municipality launched its flagship Open City Archive platform in March 2025, promising unified public access to photographs spanning the late Ottoman period through the early Republic. Eighteen months in, the platform's search function returns multiple near-identical results for landmark queries. Search for Galata Tower restoration photographs from the 1960s, for instance, and users routinely encounter six or seven versions of the same image, filed under different catalogue numbers and different departments, sometimes with contradictory metadata. The clutter is not cosmetic. It erodes trust in the archive as a research tool and inflates storage costs that the municipality is already under pressure to justify amid ongoing lira-driven budget constraints.
What the Institutions Are Weighing
The core decision facing both the metropolitan municipality and the Istanbul Archaeological Museums — which jointly contributed holdings to the Open City Archive — is whether to pursue automated deduplication using AI-assisted image recognition software, or to invest in a slower, human-curated review process. Neither option is cheap, and neither is risk-free.
Automated deduplication tools have been piloted in comparable archival projects elsewhere in Europe, including by the Bibliothèque nationale de France, which began a large-scale image deduplication programme for its Gallica platform in 2022. The French experience showed that algorithmic tools achieved accuracy rates of roughly 94 percent on photographic material but struggled with historically significant variants: two prints of the same negative made at different times, for example, can carry distinct archival value and should not be merged or deleted. Istanbul's archivists face the same dilemma, particularly for collections covering the Beyoğlu and Karaköy neighbourhoods, where the built environment changed rapidly across the mid-twentieth century and sequential photographs of the same street carry documentary weight that an algorithm cannot easily assess.
A human-curated review, meanwhile, would require dedicated staffing. The municipality's digitisation budget for 2026 — set before the latest round of inflation adjustments — had already allocated funds toward metadata standardisation rather than deduplication specifically, meaning any serious curation programme would require either a budget amendment or a reallocation away from other planned work, including an ongoing project to digitise neighbourhood registry documents held at the Fatih district archive on Macar Kardeşler Caddesi.
The Decisions Ahead and Who Makes Them
Three distinct choices now need to be made, and they need to be made in sequence. First, the Istanbul Metropolitan Municipality's digital infrastructure committee must decide by the end of the third quarter of 2026 whether automated deduplication, manual deduplication, or a hybrid of the two will govern the Open City Archive cleanup. Second, the Istanbul Archaeological Museums, which operate under the national Ministry of Culture and Tourism rather than the municipality, must agree with the municipality on a shared metadata standard before any cross-institutional deduplication can proceed; without that agreement, cleaning up one database risks creating new inconsistencies where records from the two systems overlap. Third, and most consequentially, the municipality must establish a clear retention policy: when a duplicate is flagged, who has the authority to delete it, and can that deletion be reversed?
Researchers and archivists working with collections at the Atatürk Library in Taksim — one of the city's most heavily used public image repositories — have long argued that deletion decisions should require sign-off from at least two independent specialists, not just a database administrator acting on an algorithm's recommendation. That procedural safeguard does not currently exist in the Open City Archive's published governance documents.
The stakes for ordinary users are concrete. Students at Istanbul University's history faculty, journalists researching the transformation of the Bosphorus shoreline, and heritage organisations tracking changes to listed buildings in Balat and Fener all depend on the archive's reliability. Getting the deduplication framework right, before the cleanup begins at scale, is the assignment. Getting it wrong risks permanently degrading a record of one of the world's most photographed cities.