Istanbul's municipal digitisation drive has hit a bureaucratic wall. The digital archive of the Istanbul Metropolitan Municipality (İstanbul Büyükşehir Belediyesi, IBB), which has been cataloguing tens of thousands of historical photographs spanning Ottoman-era Eminönü to modern Kadıköy, is carrying a significant volume of duplicate image files that is degrading search performance and keeping public researchers from accessing records efficiently. The problem is not trivial: archive managers in the IBB's cultural documentation unit have been wrestling for months with how to identify, flag and ultimately replace or retire redundant image entries without destroying irreplaceable originals.
The issue has sharpened in 2026 because the IBB's broader open-data commitments, pledged as part of the municipality's transparency agenda under Mayor Ekrem İmamoğlu's administration, set a deadline of the end of this year for a publicly searchable visual heritage portal. Duplicate entries undermine that goal directly. If two or three versions of the same Galata Tower photograph appear under different metadata tags, a researcher querying the system for unique Beyoğlu streetscapes from the 1940s risks drowning in noise rather than finding signal. The decision on how to handle duplicates is therefore not a back-office IT question — it is a policy question about what the city's memory looks like.
What the Options Actually Are
Archive professionals broadly face three paths when confronting duplicate image inventories. The first is automated exact hash-matching, where software computes a fingerprint of each file's contents and flags byte-identical files for human review. This works well for exact copies but fails on near-duplicates: a slightly cropped scan of the same Süleymaniye Mosque photograph, for example, will not register as a match. The second approach uses perceptual hashing algorithms, which fingerprint what an image looks like rather than how it is stored and so are more tolerant of minor alterations; several European municipal archives, including those in Amsterdam and Vienna, have adopted this method since 2022. The third, most labour-intensive route is manual curation by specialist archivists who can make editorial judgements about which version of an image carries superior resolution, provenance notes or contextual metadata.
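The difference between the first two approaches can be made concrete with a short Python sketch. It is illustrative only and does not describe any tool the IBB has chosen: the function names are hypothetical, the "images" are small synthetic grids of greyscale values standing in for scans that have already been shrunk to 9 by 8 pixels, and the perceptual method shown is a simple difference hash, one of several such algorithms. A uniform brightness shift stands in for a re-scan; heavier changes such as cropping would produce a larger distance and need a threshold chosen by the archive.

```python
import hashlib

def exact_fingerprint(data: bytes) -> str:
    """Cryptographic hash of the raw bytes: any changed byte changes the result."""
    return hashlib.sha256(data).hexdigest()

def difference_hash(grid) -> int:
    """Simple perceptual hash: one bit per adjacent pixel pair in each row,
    set when the left pixel is brighter than the right one."""
    bits = 0
    for row in grid:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def to_bytes(grid) -> bytes:
    """Flatten a greyscale grid (values 0-255) into raw bytes."""
    return bytes(value for row in grid for value in row)

# Synthetic 8-row by 9-column greyscale image (assumption: already downscaled).
original = [[(x * 37 + y * 11) % 200 for x in range(9)] for y in range(8)]
# The "same photograph" re-scanned slightly brighter.
rescanned = [[value + 3 for value in row] for row in original]

same_bytes = exact_fingerprint(to_bytes(original)) == exact_fingerprint(to_bytes(rescanned))
distance = hamming_distance(difference_hash(original), difference_hash(rescanned))

print(same_bytes)  # False: exact hashing treats the two scans as unrelated files
print(distance)    # 0: the difference hash sees the same picture
```

In practice a pair would be flagged for human review when the distance falls below a small threshold rather than only at zero, which is where the archivist's judgement comes back in.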
Istanbul's archive currently holds digitised material drawn from at least two major institutional collections: the İstanbul Araştırmaları Enstitüsü (Istanbul Research Institute) on Meşrutiyet Caddesi in Beyoğlu, and the Atatürk Kitaplığı on Miralay Şefik Bey Sokak in Gümüşsuyu, near Taksim. Both institutions have been feeding material into the IBB's central repository at different times and under different metadata standards, which is precisely how duplication compounds. A photograph donated to one collection in 2019 under one tagging convention can reappear as a separate entry when a second batch arrives from the other institution in 2024.
The Decisions Coming in the Next Six Months
Three specific choices are now sitting with IBB's digital infrastructure directorate. First, whether to procure a commercial deduplication platform — licence costs for systems used by comparable European city archives run roughly between €40,000 and €120,000 annually — or to invest in open-source tooling that requires more in-house technical capacity. Second, whether to establish a joint editorial committee with the İstanbul Araştırmaları Enstitüsü and the Atatürk Kitaplığı to agree on a single metadata standard before any more material is ingested. Without that agreement, new duplicates will appear faster than old ones can be resolved. Third, and most consequentially, the municipality must decide what the retention policy is for flagged duplicates: permanent deletion, quarantine in a separate non-public layer, or active replacement with a higher-quality master file.
Researchers and heritage groups who rely on the archive say the last option — replacement rather than deletion — is the approach most consistent with archival ethics, since provenance chains can be disrupted by outright removal. The Istanbul-based digital heritage consultancy Arşiv Kolektif, which has worked with both the IBB and the Boğaziçi University library system, has publicly advocated for a quarantine-first model in papers presented at the 2025 Ankara digital archives symposium.
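Why removal disrupts provenance while quarantine does not can be shown with a minimal Python sketch. Everything in it is an assumption made for illustration: the record fields, the identifiers and the function names are hypothetical and do not reflect the IBB's actual data model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class ImageRecord:
    record_id: str
    source_collection: str               # where the entry came from
    public: bool = True                  # visible in the public portal
    superseded_by: Optional[str] = None  # master record, if quarantined

def quarantine_duplicate(archive: Dict[str, ImageRecord],
                         duplicate_id: str, master_id: str) -> None:
    """Hide the duplicate from public view but keep it, linked to its master."""
    record = archive[duplicate_id]
    record.public = False
    record.superseded_by = master_id

def delete_duplicate(archive: Dict[str, ImageRecord], duplicate_id: str) -> None:
    """Remove the duplicate outright; its provenance data goes with it."""
    del archive[duplicate_id]

def public_entries(archive: Dict[str, ImageRecord]) -> List[str]:
    """Identifiers a portal visitor would see."""
    return sorted(rid for rid, rec in archive.items() if rec.public)

# Two hypothetical entries for the same photograph from different collections.
archive = {
    "A-001": ImageRecord("A-001", "collection A"),
    "B-417": ImageRecord("B-417", "collection B"),
}
quarantine_duplicate(archive, "B-417", master_id="A-001")

print(public_entries(archive))             # ['A-001']
print(archive["B-417"].source_collection)  # collection B
```

Under quarantine the public portal shows a single entry, yet the record of which collection supplied the second copy survives and can be restored. Calling `delete_duplicate` instead would give the same public view but discard that trail.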
The IBB's open-data portal deadline of December 31, 2026 is now less than six months away. If the deduplication policy is not settled and implemented by September at the latest, the portal risks launching with a cluttered, unreliable image database — the opposite of the transparent civic resource the municipality has promised residents from Fatih to Üsküdar.