
Textile Museum St.Gallen: Testing AI-supported metadata extraction on historic fashion photographs

In collaboration with Luba Nurse and Liliane Vogt, Textile Museum St.Gallen

The Textile Museum St.Gallen worked with Archipanion on a pilot to test whether written information on historic fashion photographs could be turned into structured, searchable metadata.


Summary

The Textile Museum St.Gallen (founded in 1878) holds an extensive collection of historic hand and machine embroidery, lace, fabrics, costumes, pattern books, design drawings, and photographs. Among the collection is a body of fashion photographs from the 1920s to the 1990s, documenting East Swiss machine embroidery produced for the international fashion market.

In a pilot project, Archipanion worked with the museum on a representative set of 100 digitised fashion photographs to test whether AI-supported metadata extraction could turn written information on the front and back of each photograph into structured, searchable data. The photographs contained multi-lingual typed, stamped, and handwritten notes such as captions, dates, names, geographic references, and reference numbers: information that was visible in the images but not yet searchable in the museum’s collection database.

The pilot produced a structured spreadsheet of extracted data for each photograph. If integrated into the museum's collection records, this kind of structured output would make it possible to search across previously inaccessible information — names, locations, dates, fabrics, photographer credits — without inspecting each photograph manually.

The Challenge

Many of the museum’s 20th-century fashion photographs contain valuable metadata written or printed directly on the photograph. For example, the reverse sides often included information about the photographer, the model, the clothing and the fabric shown, the fashion brand name, the location of the fashion show and the year the photograph was taken.

However, this information is recorded neither in the museum’s database nor in any searchable form, and is therefore not accessible to researchers.

Making this information searchable was not straightforward. The text appeared in different forms — typed, stamped, and handwritten — and across the three languages of the St.Gallen textile industry: German, French, and English. To address this, the museum engaged Archipanion to pilot AI-supported metadata extraction: in practice, machine learning models that read text in images and convert it into structured data fields, with human review at each stage.

"On a fashion photograph, the recto and verso form a single epistemic unit. The image on the front and the stamps, captions, and reference numbers on the back together constitute the object — but in conventional cataloguing, that back side tends to be summarised rather than recorded in full. The information is visible, but not searchable, and over time it becomes effectively dissociated from the object it belongs to. What interested us in this pilot was less the question of whether AI can transcribe handwriting, and more whether it could help us hold both sides of the photograph together as a single record. On 100 photographs, that question is now open in a usefully concrete way."

Luba Nurse
Head of Collection and Library, Textile Museum St.Gallen


Image: Front and back views of two historic fashion photographs from the Textile Museum St.Gallen pilot (TMSG TMF 27.3.19, TMSG TMF 68.1.4). The examples show the kinds of written and stamped information Archipanion extracted, including captions, names, reference details, and other contextual metadata.

The Pilot

The pilot test set comprised 100 historic fashion photographs. Each photograph was represented by two image files: one showing the front and one showing the back. The 100 photographs, spanning roughly 80 years, were selected as a representative set to evaluate the extraction approach before any larger-scale project was considered.

The extraction process focused on a defined set of metadata fields. The aim was for each photograph to produce one row in a spreadsheet, with columns capturing information such as: the main written content extracted from the photograph; any written date or date-related information; names of people or organisations mentioned; information from stamps, logos, or similar visual markers; and place names or other geographic references.

Many of these fields were multi-valued; a single photograph might, for example, list several names. Archipanion therefore split these fields into multiple numbered subcolumns as needed. The goal was to produce a structured CSV or Excel file in which each field appeared in its own column, ready for import into the museum’s database.
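
The subcolumn approach described above can be sketched in a few lines. This is an illustrative example only: the field names and records are invented, not the museum's actual schema. The idea is that each multi-valued field (such as names) is spread into numbered subcolumns, sized to the longest list seen across all records, so that every photograph becomes one flat spreadsheet row.

```python
# Hypothetical extracted records: one dict per photograph, with
# multi-valued fields stored as lists. Field names are invented.
records = [
    {"inventory_no": "TMF 27.3.19", "names": ["A. Keller", "Modehaus X"], "places": ["Paris"]},
    {"inventory_no": "TMF 68.1.4", "names": ["J. Dupont"], "places": []},
]

MULTI_VALUED = ("names", "places")

# One subcolumn per value, up to the longest list seen in any record.
widths = {f: max((len(r.get(f, [])) for r in records), default=0) for f in MULTI_VALUED}

def flatten(record):
    """Spread each multi-valued field into numbered subcolumns
    (names_1, names_2, ...) so all rows share the same columns."""
    row = {k: v for k, v in record.items() if k not in MULTI_VALUED}
    for field in MULTI_VALUED:
        values = record.get(field, [])
        for i in range(widths[field]):
            row[f"{field}_{i + 1}"] = values[i] if i < len(values) else ""
    return row

rows = [flatten(r) for r in records]
# rows[0] -> {"inventory_no": "TMF 27.3.19", "names_1": "A. Keller",
#             "names_2": "Modehaus X", "places_1": "Paris"}
```

Rows built this way can be written out with Python's standard `csv.DictWriter`, giving exactly the one-row-per-photograph export the pilot aimed for.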

Workflow and Deliverables

The extraction work followed an iterative process with three main stages:

  1. Initial extraction test: Archipanion first ran the AI extraction process on a smaller sample of around 20 photographs, selected at random from the full pilot set of 100. This initial output was used for close review and comparison before the extraction process was applied across the full pilot set.

  2. Comparison and reference review: the museum’s documentation specialists reviewed and corrected the extracted information for this 20-photograph test set, creating a checked reference set. This gave Archipanion a reliable basis for comparing the AI output against an accurate version of the data, and for improving the extraction process before moving on to the full pilot set.

  3. Final export spreadsheet: Based on the lessons learned in step two, Archipanion refined the extraction process, re-ran it across the full pilot set, and prepared the final deliverable: a clean export spreadsheet for all 100 photographs.
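
The comparison in stage two can be sketched as a field-level agreement check: for each metadata field, count how often the AI-extracted value matches the museum's corrected reference value. The records and field names below are invented for illustration; the pilot's actual comparison criteria are not specified in this write-up.

```python
# Invented sample data: AI output next to the human-corrected reference set.
ai_output = [
    {"date": "1932", "place": "Paris", "photographer": "Atelier X"},
    {"date": "1958", "place": "St.Gallen", "photographer": ""},
]
reference = [
    {"date": "1932", "place": "Paris", "photographer": "Atelier Y"},
    {"date": "1958", "place": "St.Gallen", "photographer": ""},
]

def field_agreement(ai_rows, ref_rows):
    """Return, per field, the share of records where the AI value
    matches the corrected reference value exactly (after trimming)."""
    fields = ref_rows[0].keys()
    return {
        f: sum(a[f].strip() == r[f].strip() for a, r in zip(ai_rows, ref_rows)) / len(ref_rows)
        for f in fields
    }

scores = field_agreement(ai_output, reference)
# scores -> {"date": 1.0, "place": 1.0, "photographer": 0.5}
```

A per-field breakdown like this makes it clear which categories hold up and which need refinement before re-running extraction on the full set, which is what stage three then does.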

The multilingual content — captions, stamps, and annotations across German, French, and English — was extracted and, where needed, translated into German for the museum’s records.

"The most informative part of the pilot for us was working through the reference set of twenty photographs in detail. Correcting the extracted data field by field showed where the categories we had defined in advance held up, where they needed to be split or refined, and how the multilingual content — across the French, German, and English that the St.Gallen textile industry has worked in for over a century — would actually behave once structured. That close review also raised questions we still need to answer before any larger rollout: how the extracted data would map onto our collection management system, and what preparation the source material would need. The iterative process Archipanion built around that review — extracting on a smaller sample first, comparing against our checked reference set, then refining — is what made a pilot of 100 photographs the right scale to ask those questions seriously."

Liliane Vogt
Documentation Specialist, Textile Museum St.Gallen

Results and Practical Outcomes

The pilot produced a structured dataset for the 100-photograph sample, demonstrating that the iterative extraction method — AI extraction reviewed against a corrected reference set — could handle the visual and linguistic complexity of these photographs at this scale. Whether the same approach scales to thousands of photographs, and how it integrates with the museum's collection management system, remain open questions for any future phase.

Beyond the structured dataset itself, the pilot pointed to several further benefits worth exploring at scale.

Improved legibility. Where verso inscriptions are in faded ink or pencil, digitisation combined with metadata extraction can make text legible that would be difficult or impossible to read on the physical photograph alone.

Linking and analysis. Structured metadata across a body of photographs would make it possible to query and analyse the collection in ways that are not feasible photograph-by-photograph — opening new research questions about the East Swiss machine embroidery industry, its clients, photographers, and international circulation.

Reduced handling and preventive conservation. A digital copy paired with structured metadata acts as a surrogate for the physical object. Researchers can answer most initial questions from the surrogate, with the photograph itself retrieved only when there is a specific reason that the digital record cannot answer. For unmounted historic photographs, which are vulnerable to surface abrasion, edge damage, and cumulative stress from handling, this is a meaningful preservation benefit. It also addresses dissociation in two senses at once — the physical separation that happens when objects are handled repeatedly, and the informational separation between an object and the knowledge inscribed on its verso.

Lessons Learned

Several lessons emerged from the pilot that would shape any future rollout. First, the iterative method — extracting on a small sample, correcting field by field against a reference set, then refining — proved essential. The categories defined in advance did not all hold up under close review; some needed to be split, others refined, and the multilingual content (particularly French and English alongside German records) revealed structural questions that only became visible during correction.

Second, scaling from a pilot of 100 photographs to processing thousands would require early coordination with the museum's IT, collections, and workflow systems, so that extracted data can be imported and used effectively within the collection management system. This is not a technical question about extraction accuracy; it is a question about how the structured output meets the museum's existing infrastructure.

Third, source material itself needs preparation. The pilot worked with photographs that had been scanned consistently, with both recto and verso captured. Any larger project would need to assess a comparable state of digitisation across the wider photograph holdings before extraction could begin at scale.

What this means for museums, archives, and heritage collections

This case study is relevant beyond one set of fashion photographs. Many museums, archives, and heritage organisations hold collections where useful information is visible but not searchable across the collection as a whole. In some cases, that information is embedded in visually complex material such as annotated photographs, albums, design materials, or sample books. In others, collections may be more text-heavy and structurally straightforward, such as index cards, registers, and ledgers, where information appears in more standard formats. The Textile Museum St.Gallen pilot sits towards the more complex end of this spectrum. It shows that even when information is embedded in layered and visually complex material, it can still be extracted.

About Textile Museum St.Gallen

The Textile Museum St.Gallen, founded in 1878, holds an extensive collection of historic hand and machine embroidery, lace, fabrics, costumes, pattern books, design drawings, and photographs. A selection of the museum's fashion photography is the subject of the forthcoming exhibition Mise en Scène. Fashion photography from the Belle Époque to the present day (3 July 2026 – 28 February 2027).

www.textilmuseum.ch/en/fashion-photography

This pilot project was supported by the E. Fritz und Yvonne Hoffmann-Stiftung.

About Archipanion

If your institution is working with collections where important information is visible but not searchable across the collection, Archipanion can help assess where metadata extraction can add value. Museums, archives, and heritage organisations interested in exploring this kind of work are invited to contact Archipanion to discuss their collections.
