Why Clinical AI Doesn’t Care About Your Hochdeutsch
Clinicians in Switzerland move constantly between German, English, dialect, and French. Here’s what actually matters when prompting AI safely.
Clinicians in Switzerland bounce between chart notes in German, literature in English, patient messages in dialect, and maybe handovers in French, sometimes in the same shift. So does the language you prompt in actually change accuracy and safety?
The practical answer: use a hybrid workflow
Input: your clinical working language (DE/FR/IT)
Process instruction: “Analyze using international medical literature and standard medical terminology.”
Output: the language you need for documentation and handoffs
Why this works: it separates the language the model reasons in from the language you document in, and the explicit process instruction keeps your prompts auditable.
Example: 03:00 admission note in Swiss German → prompt “Summarize using standard medical terminology” → structured output in German for your clinical information system.
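The three-part workflow above can be sketched as a small prompt builder. This is a minimal illustration, not a validated clinical tool: the class name `HybridPrompt`, its field names, and the section labels in the rendered prompt are assumptions made for this example, and the result is a plain string you would pass to whichever model you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridPrompt:
    """Keeps input, process instruction, and output language separate for auditing."""

    clinical_text: str    # input, in the clinician's working language (DE/FR/IT)
    output_language: str  # language needed for documentation and handoffs
    process_instruction: str = (
        "Analyze using international medical literature "
        "and standard medical terminology."
    )

    def render(self) -> str:
        # Section labels are illustrative, not a standard.
        return "\n".join([
            f"INSTRUCTION: {self.process_instruction}",
            f"OUTPUT LANGUAGE: {self.output_language}",
            "CLINICAL TEXT:",
            self.clinical_text.strip(),
        ])


if __name__ == "__main__":
    prompt = HybridPrompt(
        clinical_text="Anamnese: seit 3 Tagen Fieber, Husten, Dyspnoe bei Belastung.",
        output_language="German",
    )
    print(prompt.render())
```

Because the instruction and the output language are explicit fields rather than buried in free text, the same object can be logged alongside the model's answer for later review.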
Four key facts
1) English is the default knowledge substrate
As of December 2023, 86.5% of PubMed-indexed publications were in English.
That matters because the data used to train and evaluate models tends to reflect what is most available, so English-language medical content is likely to be overrepresented.
2) “English is always better” is not a fact
Multilingual performance varies by model and task, so claims should be benchmark-based, not anecdotal. Medical multilingual benchmarks like MedExpQA exist specifically to measure cross-language differences.
Takeaway: English is often a reasonable default for complex reasoning, but it is not a safety guarantee.
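To make a language claim benchmark-based rather than anecdotal, score the same model on the same questions in each language and compare accuracy per language. The sketch below shows only that arithmetic. The function name `accuracy_by_language` and the `(language, is_correct)` record format are assumptions for this example, and the sample records are placeholders, not results from MedExpQA or any other benchmark.

```python
from collections import defaultdict


def accuracy_by_language(results):
    """Compute accuracy per language.

    results: iterable of (language, is_correct) pairs from your own evaluation run.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for language, is_correct in results:
        totals[language] += 1
        correct[language] += int(is_correct)
    return {language: correct[language] / totals[language] for language in totals}


if __name__ == "__main__":
    # Placeholder records only; replace with graded answers from a real evaluation.
    placeholder = [
        ("lang_a", True), ("lang_a", False),
        ("lang_b", True), ("lang_b", False),
    ]
    print(accuracy_by_language(placeholder))
```

With only a handful of questions per language, differences like these are noise; a meaningful comparison needs a benchmark-sized question set and the same grading procedure for every language.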
3) The bigger risk is messy input
What reliably worsens outputs is not “German vs English” but ambiguity and noise:
unstructured histories (e.g. “pt unwell since days, maybe fever?, meds??, unclear past medical history”)
inconsistent terminology (e.g. mixing “Luftnot”, “dyspnea”, “can’t breathe” without onset or severity; “renal failure” without stage or creatinine)
copy-pasted patient messages (e.g. long chat text: “I feel weird… heart skipping… since yesterday… btw could I be pregnant??”)
mixed languages in one prompt (e.g. “Anamnese: seit 3 Tagen Fieber. Triage: douleur thoracique 7/10. DDx: PE vs pneumonia. Angehörige: ‘sie isch hüt mega komisch gsi’.”)
The Swiss trap: “copy-paste soup” (note fragments, patient messages, and dialect) is not a prompting strategy.
Rule: normalize, structure, and use clinical terms before asking for reasoning.
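One way to apply this rule is a small pre-check that runs before any text reaches a model. The sketch below flags empty fields and vague markers in a structured case summary. The field list, the marker pattern, and the function name `precheck` are all assumptions chosen for illustration; a real checklist would come from your own documentation standards.

```python
import re

# Illustrative field list; adapt to your own documentation template.
REQUIRED_FIELDS = ("chief_complaint", "onset", "severity", "medications", "history")

# Illustrative markers of uncertainty in English and German.
VAGUE_MARKERS = re.compile(
    r"\?\?|\bmaybe\b|\bunclear\b|\bvielleicht\b|\bunklar\b",
    re.IGNORECASE,
)


def precheck(case: dict) -> list[str]:
    """Return a list of problems to fix before asking a model for reasoning."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = str(case.get(field) or "").strip()
        if not value:
            problems.append(f"missing: {field}")
        elif VAGUE_MARKERS.search(value):
            problems.append(f"vague: {field}")
    return problems


if __name__ == "__main__":
    case = {
        "chief_complaint": "Dyspnea",
        "onset": "3 days ago",
        "severity": "",
        "medications": "meds??",
        "history": "unclear past medical history",
    }
    print(precheck(case))
    # prints ['missing: severity', 'vague: medications', 'vague: history']
```

An empty result does not mean the input is clinically complete; it only means the obvious gaps and hedges were resolved before the prompt was written.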
4) Swiss-specific tools are emerging, with caveats
AlpineAI’s SwissGPT: AlpineAI states SwissGPT can be used in all Swiss national languages and in English. But language support is not the same as clinically validated performance.
Apertus (EPFL/ETH/CSCS, Swiss AI Initiative): Apertus is presented as a large-scale open, multilingual Swiss model trained on 15 trillion tokens across 1,000+ languages, including Swiss German and Romansh. At the same time, Apertus is primarily a general-purpose LLM, and its medical usefulness depends on domain adaptation, clinical evaluation, and appropriate governance.
Bottom line
Routine work: local language is fine.
Complex reasoning: English is often a reasonable choice, but it is not guaranteed to be better.
Always required: clinical judgment, verification, and missing-data checks.
LLM outputs are decision support, not decisions.
Sources
Hamad AA. Medical research production in native languages: descriptive analysis of PubMed. Qatar Med J. 2024. PubMed: https://pubmed.ncbi.nlm.nih.gov/38746849/
Alonso I, Oronoz M, Agerri R. MedExpQA: Multilingual benchmarking of Large Language Models for Medical Question Answering. Artif Intell Med. 2024. arXiv: https://arxiv.org/abs/2404.05590; journal: https://www.sciencedirect.com/science/article/pii/S0933365724001805
AlpineAI. SwissGPT language support (Safety page). Accessed 2026-01-19. https://alpineai.swiss/en/safety/
Swiss AI Initiative. Apertus (official project page). Accessed 2026-01-19. https://www.swiss-ai.org/apertus
ETH Zurich. Apertus press release. 2025-09-02. https://ethz.ch/en/news-and-events/eth-news/news/2025/09/press-release-apertus-a-fully-open-transparent-multilingual-language-model.html