Istanbul's public digital infrastructure has a clutter problem. Across the Istanbul Metropolitan Municipality's open-data portals, the Istanbul Archaeological Museums' digitisation project, and the cultural heritage databases maintained under the Culture Ministry's regional directorate on Alemdar Caddesi, tens of thousands of duplicate image files have accumulated — the residue of multiple overlapping scanning campaigns, tourism board uploads, and volunteer digitisation drives over the past decade. Archivists and IT managers across the city are now confronting what to do about it.
The issue has grown urgent in 2026 for a practical reason: storage costs. Municipal cloud contracts in Turkey are denominated in US dollars to hedge against lira volatility, and their cost in lira terms has climbed sharply as the currency has remained under sustained pressure. Maintaining redundant image copies is no longer a bureaucratic inconvenience; it is a measurable budget line. The Istanbul Metropolitan Municipality's technology directorate has flagged digital storage rationalisation as a priority task in its 2026 operational calendar, according to publicly available municipal procurement notices posted this spring.
What Other Cities Have Done
Amsterdam set an early benchmark. The Rijksmuseum's Rijksstudio platform, which went fully open-access in 2013 and has grown to host more than one million object images, invested heavily in perceptual hashing, a technique that derives a compact fingerprint from an image's visual content rather than its filename, so that visually similar images produce similar fingerprints. The museum uses it to detect and eliminate near-duplicate scans before they enter the public catalogue. By 2023 the museum's digital team reported that automated deduplication had reduced its working image repository size by roughly 18 percent, freeing significant server space and improving search result quality.
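For readers who want to see the idea concretely, here is a minimal sketch of one simple perceptual hash, the "average hash". It is an illustration only, not the Rijksmuseum's method, which the source does not describe. The function names and the 8x8 grid size are assumptions, and images are represented as plain lists of greyscale values so the example needs no imaging library.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 brightness values (assumed input format)


def shrink(pixels: Gray, size: int = 8) -> Gray:
    """Downsample to size x size by averaging rectangular blocks of pixels."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)
        row = []
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def average_hash(pixels: Gray, size: int = 8) -> int:
    """One bit per cell of the shrunken image: 1 if brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits; a small distance suggests similar images."""
    return bin(a ^ b).count("1")


# Hypothetical test images: a horizontal gradient, a slightly brighter
# rescan of it, and its photographic negative.
original = [[x * 8 for x in range(32)] for _ in range(32)]
brighter = [[min(255, v + 10) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

print(hamming(average_hash(original), average_hash(brighter)))
print(hamming(average_hash(original), average_hash(inverted)))
```

The brighter rescan lands close to the original in Hamming distance, while the negative lands far away. A filename comparison would see none of this.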
Seoul took a different approach. The National Folk Museum of Korea, located in Gyeongbokgung Palace, integrated AI-assisted duplicate detection directly into its ingest pipeline, meaning files are screened at the point of upload rather than cleaned up retrospectively. The system flags images with more than 85 percent visual similarity for human review before they are committed to the archive. Seoul's model has attracted attention from peer institutions across East Asia precisely because it prevents backlog formation rather than treating it after the fact.
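Screening at the point of upload can be sketched as a small gate in front of the archive. The sketch below is an assumption-laden illustration, not the museum's system: the source gives only the 85 percent figure, so the similarity metric (share of matching bits between two 64-bit fingerprints), the `IngestGate` class, and all field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

HASH_BITS = 64            # assumed fingerprint width
REVIEW_THRESHOLD = 0.85   # figure from the article; the metric is an assumption


def similarity(a: int, b: int) -> float:
    """Share of matching bits between two fingerprints, from 0.0 to 1.0."""
    return 1.0 - bin(a ^ b).count("1") / HASH_BITS


@dataclass
class IngestGate:
    archive: Dict[str, int] = field(default_factory=dict)
    review_queue: List[Tuple[str, str, float]] = field(default_factory=list)

    def submit(self, file_id: str, fingerprint: int) -> bool:
        """Commit the file and return True, or hold it for review and return False."""
        for existing_id, existing_fp in self.archive.items():
            score = similarity(fingerprint, existing_fp)
            if score > REVIEW_THRESHOLD:
                # Too close to something already archived: a person decides.
                self.review_queue.append((file_id, existing_id, score))
                return False
        self.archive[file_id] = fingerprint
        return True


gate = IngestGate()
gate.submit("scan_a", 0xFFFF0000FFFF0000)   # first of its kind, committed
gate.submit("scan_b", 0xFFFF0000FFFF0001)   # one bit away from scan_a, held
gate.submit("scan_c", 0x0000FFFF0000FFFF)   # visually unrelated, committed
print(sorted(gate.archive), gate.review_queue)
```

The linear scan over the archive is fine for a sketch. A real repository of this size would need an index over fingerprints, which is outside the scope of this example.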
Istanbul's situation is more fragmented than either city's. Amsterdam's deduplication effort was centralised within a single institution. Seoul's was built into a greenfield digital system. Istanbul, by contrast, has dozens of institutions (the Topkapı Palace Museum directorate in Sultanahmet, the Istanbul Modern art museum that reopened in Galata in 2023, the Atatürk Library on Millet Caddesi in Beyazıt), each managing its own image repository with little cross-institutional coordination. When the same Ottoman miniature or Bosphorus panorama is scanned independently by three separate bodies, nobody is automatically alerted.
Local Efforts Taking Shape
There are early signs of coordination. The SALT research institution, which maintains archives across its Beyoğlu and Galata branches, has piloted a shared metadata schema with at least two partner organisations since late 2024, specifically designed to flag images whose descriptive tags suggest likely duplication before a full visual comparison is run. The approach is manual-heavy but workable at SALT's scale.
The larger municipal challenge belongs to the Istanbul Metropolitan Municipality's BELBIM subsidiary, which manages the city's broader technology infrastructure. BELBIM has issued a tender — visible on Turkey's public procurement platform ekap.kik.gov.tr — for a digital asset management system update, with duplicate detection listed among the functional requirements. The tender was posted in May 2026 with a submission deadline in late June; a contract award is expected by September.
For cultural institutions still working through the problem without a municipal solution, archivists recommend a straightforward first step: auditing filenames and upload dates to identify batches from known duplicate campaigns — typically the 2015–2017 wave of EU co-funded digitisation projects and the 2020–2021 pandemic-era volunteer scanning drives — before committing to any automated tool. That alone, according to documented case studies from institutions in London and Prague, can surface 30 to 40 percent of the duplicate volume without specialist software. Istanbul's archivists are not waiting for the perfect system. They are starting with the known problem files and working forward from there.
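The filename-and-date audit described above can be approximated with the standard library alone. The sketch below rests on assumptions: the record format (filename plus upload date), the exact day boundaries of the two campaign windows, the copy-marker patterns, and every function name are hypothetical, chosen only to illustrate the approach.

```python
import re
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Optional, Tuple

# Campaign years come from the article; the day boundaries are assumed.
CAMPAIGN_WINDOWS = {
    "eu_cofunded_2015_2017": (date(2015, 1, 1), date(2017, 12, 31)),
    "volunteer_2020_2021": (date(2020, 1, 1), date(2021, 12, 31)),
}

# Trailing markers that often signal a re-upload, e.g. "_copy", "-v2", " (1)".
_COPY_MARKER = re.compile(r"(?:[\s_-]+(?:copy|v\d+)|\s*\(\d+\))+$")


def normalise(filename: str) -> str:
    """Lowercase, drop the extension and copy markers, unify separators."""
    stem = filename.rsplit(".", 1)[0].lower()
    stem = _COPY_MARKER.sub("", stem)
    return re.sub(r"[\s_-]+", "_", stem).strip("_")


def campaign_for(uploaded: date) -> Optional[str]:
    """Name of the known campaign whose window contains the date, if any."""
    for name, (start, end) in CAMPAIGN_WINDOWS.items():
        if start <= uploaded <= end:
            return name
    return None


def audit(records: Iterable[Tuple[str, date]]) -> Dict[str, List[Tuple[str, str]]]:
    """Group campaign-era files by normalised name; keep groups of two or more."""
    groups: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for filename, uploaded in records:
        campaign = campaign_for(uploaded)
        if campaign is not None:
            groups[normalise(filename)].append((filename, campaign))
    return {name: files for name, files in groups.items() if len(files) > 1}


# Hypothetical catalogue entries.
records = [
    ("Topkapi_Miniature_012.tif", date(2016, 3, 4)),
    ("topkapi-miniature-012 (1).jpg", date(2020, 11, 2)),
    ("bosphorus_panorama.jpg", date(2019, 5, 1)),
    ("galata_tower.tif", date(2021, 6, 6)),
]
for name, files in audit(records).items():
    print(name, files)
```

The output is a list of candidates for human review, not a deletion list: two files with matching names may still be different scans.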