Why Clinical AI Doesn’t Care About Your Hochdeutsch

20 January 2026 By SwissMed AI
Clinicians in Switzerland move constantly between German, English, dialect, and French. Here’s what actually matters when prompting AI safely.

Clinicians in Switzerland bounce between chart notes in German, literature in English, patient messages in dialect, and maybe handovers in French, sometimes in the same shift. So does the language you prompt in actually change accuracy and safety?

The practical answer: use a hybrid workflow

  • Input: your clinical working language (DE/FR/IT)

  • Process instruction: “Analyze using international medical literature and standard medical terminology.”

  • Output: the language you need for documentation and handoffs

Why this works: it separates reasoning from documentation clarity and keeps your prompts auditable.

Example: 03:00 admission note in Swiss German → prompt “Summarize using standard medical terminology” → structured output in German for your clinical information system.
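The three-step workflow above can be sketched as a simple prompt template. This is a minimal illustrative sketch, not any vendor's API: the function name and structure are assumptions, and the process instruction is the one quoted above.

```python
# Hypothetical sketch of the hybrid workflow:
# input in the clinical working language, a fixed English process
# instruction, and an explicit output language for documentation.
# All names here are illustrative, not from any specific tool.

def build_hybrid_prompt(clinical_note: str, output_language: str = "German") -> str:
    """Combine the clinician's note with a fixed process instruction
    and an explicit output-language request."""
    process_instruction = (
        "Analyze using international medical literature "
        "and standard medical terminology."
    )
    return (
        f"{process_instruction}\n\n"
        f"Clinical note (source language preserved):\n{clinical_note}\n\n"
        f"Respond in {output_language}, structured for clinical documentation."
    )

# Example: a German-language admission note, output requested in German.
prompt = build_hybrid_prompt("Anamnese: seit 3 Tagen Fieber, Husten.", "German")
print(prompt)
```

Keeping the process instruction as a fixed constant, rather than retyping it each time, is also what makes the prompt auditable: every request to the model carries the same documented reasoning instruction.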


Four key facts

1) English is the default knowledge substrate

As of December 2023, 86.5% of PubMed-indexed publications were in English.
That matters because training and evaluation data often reflect what is most available.

2) “English is always better” is not a fact

Multilingual performance varies by model and task, so claims should be benchmark-based, not anecdotal. Medical multilingual benchmarks like MedExpQA exist specifically to measure cross-language differences.
Takeaway: English is often a reasonable default for complex reasoning, but it is not a safety guarantee.

3) The bigger risk is messy input

What reliably worsens outputs is not “German vs English” but ambiguity and noise:

  • unstructured histories (e.g. “pt unwell since days, maybe fever?, meds??, unclear past medical history”)

  • inconsistent terminology (e.g. mixing “Luftnot”, “dyspnea”, “can’t breathe” without onset or severity; “renal failure” without stage or creatinine)

  • copy-pasted patient messages (e.g. long chat text: “I feel weird… heart skipping… since yesterday… btw could I be pregnant??”)

  • mixed languages in one prompt (e.g. “Anamnese: seit 3 Tagen Fieber. Triage: douleur thoracique 7/10. DDx: PE vs pneumonia. Angehörige: ‘sie isch hüt mega komisch gsi’.” — history in German, triage note in French, differential in English, and a relative quoted in Swiss dialect: “she was really strange today”)

The Swiss trap: “copy-paste soup” (note fragments, patient messages, and dialect) is not a prompting strategy.
Rule: normalize, structure, and use clinical terms before asking for reasoning.
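The “normalize, structure, then reason” rule can be made concrete with a small sketch. The field list and function below are assumptions for illustration; the point is that missing data becomes explicitly visible instead of silently ambiguous.

```python
# Illustrative sketch of "normalize and structure before reasoning":
# map free-text fragments into a fixed set of fields, and mark
# anything missing as explicitly unknown rather than leaving a gap.
# Field names are hypothetical, chosen only for this example.

FIELDS = ["chief_complaint", "onset", "severity", "medications", "history"]

def structure_note(fragments: dict) -> str:
    """Render known fields and flag every missing field explicitly."""
    lines = []
    for field in FIELDS:
        value = fragments.get(field, "UNKNOWN - clarify before reasoning")
        lines.append(f"{field}: {value}")
    return "\n".join(lines)

structured = structure_note({
    "chief_complaint": "dyspnea",   # normalized from "Luftnot" / "can't breathe"
    "onset": "3 days ago",
})
print(structured)
```

A structured block like this, with its `UNKNOWN` markers, turns the missing-data check into something you do before prompting, rather than something the model guesses around.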

4) Swiss-specific tools are emerging, with caveats

  • AlpineAI’s SwissGPT: AlpineAI states SwissGPT can be used in all Swiss national languages and in English. But language support is not the same as clinically validated performance.

  • Apertus (EPFL/ETH/CSCS, Swiss AI Initiative): Apertus is presented as a large-scale open, multilingual Swiss model trained on 15 trillion tokens across 1,000+ languages, including Swiss German and Romansh. At the same time, Apertus is primarily a general-purpose LLM, and its medical usefulness depends on domain adaptation, clinical evaluation, and appropriate governance.


Bottom line

  • Routine work: local language is fine.

  • Complex reasoning: English is often reasonable, not guaranteed better.

  • Always required: clinical judgment, verification, and missing-data checks.

LLM outputs are decision support, not decisions.


Sources