Istanbul's municipal digital archives contain tens of thousands of duplicate photographs — some images catalogued three or four times under different file names — according to a review of documents submitted to the Istanbul Metropolitan Municipality's Information Technologies Directorate in early 2026. The duplication problem, which affects records stretching back to the early 1990s, has quietly undermined public access to one of the most visually documented cities in the world.
The timing matters. The Istanbul Metropolitan Municipality, led by CHP Mayor Ekrem İmamoğlu, is midway through a multi-year digitisation push that explicitly depends on the integrity of legacy data. If the base archives are bloated with redundant images, the cost of storage, retrieval, and AI-assisted tagging balloons with them. The problem is not new, but pressure to resolve it has sharpened as the municipality moves core services online and as heritage preservation groups grow louder about the condition of records tied to neighbourhoods like Balat, Fener, and Süleymaniye.
Three Digitisation Waves, Three Different Standards
The duplication crisis has a traceable genealogy. Istanbul went through three distinct waves of institutional digitisation. The first, roughly 1993 to 2001, was driven by the Istanbul Metropolitan Municipality's then-new computing units, which scanned print photographs at inconsistent resolutions with no shared naming convention. The second wave, from around 2005 to 2012, overlapped with Turkey's EU accession-era governance reforms and brought in external contractors who often rescanned material already in the system without cross-referencing existing entries. The third wave began after 2019, when İmamoğlu's administration launched a transparency-focused open-data programme and began migrating archives to a centralised platform.
Each transition compounded the earlier mess. Files migrated from legacy servers at the Fatih district offices and the Beyoğlu municipality annexe — both of which ran separate archiving systems before administrative consolidation — arrived at the central repository carrying their original duplicate entries intact. Staff at the İstanbul Şehir Üniversitesi library, which housed a parallel set of municipal photograph collections before the university's 2019 closure and asset transfer to the state, flagged the problem in internal correspondence at the time, though no systematic deduplication followed.
The Scale of the Problem and Current Remediation Efforts
Estimates from the municipality's own technical documentation, referenced in a February 2026 procurement notice for image-processing software, suggest the central archive holds roughly 1.4 million photograph files, a significant portion of which are suspected duplicates. The procurement notice did not specify a percentage, but it cited the need for tools capable of running hash-based and perceptual-similarity comparisons across the full collection, which is standard language for large-scale deduplication projects.
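For readers unfamiliar with the two techniques named in the notice, the difference can be illustrated with a minimal Python sketch. It is not the municipality's tooling, which the notice does not describe. The file names, byte strings, pixel grids, and the distance threshold are all hypothetical. The first function groups files whose bytes are identical, which catches the same file saved under different names. The second computes a simple "average hash" over a small grayscale grid, one common form of perceptual hashing, so that a slightly brighter rescan of the same photograph can still be matched.

```python
import hashlib
from collections import defaultdict


def exact_duplicate_groups(files):
    """Group file names whose bytes are identical, using SHA-256 digests."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    # Keep only digests shared by more than one file name.
    return [sorted(names) for names in groups.values() if len(names) > 1]


def average_hash(pixels):
    """Average hash of a small grayscale grid (rows of 0-255 integers).

    Each pixel contributes one bit: 1 if brighter than the grid mean, else 0.
    A real pipeline would first decode and downscale each image to such a grid.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def likely_same_image(pixels_a, pixels_b, max_distance=1):
    """Treat two grids as the same picture if their hashes nearly match.

    max_distance is an illustrative threshold, not a recommended value.
    """
    return hamming_distance(average_hash(pixels_a), average_hash(pixels_b)) <= max_distance


# Hypothetical catalogue: two names point at byte-identical content.
files = {
    "galata_001.tif": b"scan-A",
    "IMG_4471.tif": b"scan-A",
    "balat_street.tif": b"scan-B",
}
print(exact_duplicate_groups(files))

# Hypothetical 4x4 grids: a rescan that is uniformly 5 levels brighter.
original = [[10, 200, 10, 200], [200, 10, 200, 10],
            [10, 200, 10, 200], [200, 10, 200, 10]]
rescan = [[value + 5 for value in row] for row in original]
print(likely_same_image(original, rescan))
```

Exact hashing is cheap and leaves no ambiguity, but it misses rescans made at a different resolution or exposure. Perceptual matching catches those at the cost of borderline cases, which is why large deduplication projects usually pair the automated pass with human review.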
The SALT Galata research centre on Bankalar Caddesi in Karaköy, which maintains its own independent photographic archive of Istanbul, has grappled with similar issues at a smaller scale and resolved much of its duplicate problem through a phased manual and automated review completed between 2021 and 2023. That experience has made SALT a reference point for municipal archivists now working through the larger problem. The Istanbul Research Institute in Beyoğlu, which holds collections donated by private families and institutions, began a comparable deduplication project in 2024 using open-source perceptual hashing tools.
The practical consequences for ordinary users are real. A researcher trying to locate photographs of the Galata Bridge demolition in 1992, or images documenting the Avcılar district after the 1999 Marmara earthquake, may retrieve the same image file dozens of times under different identifiers before locating distinct visual records. Storage costs also accumulate: cloud infrastructure contracts for large public archives in comparable European cities suggest that redundant files can account for 20 to 35 percent of total storage expenditure before deduplication.
The municipality's current remediation timeline, as outlined in the February procurement notice, targets completion of a first-pass automated deduplication by the end of 2026, with a human-review phase for ambiguous matches running into the first quarter of 2027. Researchers and heritage groups working with Istanbul records should expect intermittent updates to the archive search interface during that period and, according to the municipality's public documentation, improved retrieval accuracy once the process is complete.