This is a cool article.
But if they want LLMs to use fewer em dashes, why not find and replace them with a comma or semicolon, using a regex that matches known patterns, to reduce their frequency in the training data?
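For what it's worth, here is a rough sketch of the kind of find-and-replace I mean. The function name `replace_em_dashes` and the single "dash between two words" pattern are my own assumptions, not anything from the article, and it always substitutes a comma because choosing between a comma and a semicolon would need more context than a regex has.

```python
import re

# Assumption: only handle an em dash (U+2014) sitting between two word
# characters, with optional whitespace on either side. Dashes at the start
# of a line or next to punctuation are left alone.
EM_DASH_BETWEEN_WORDS = re.compile(r"(?<=\w)\s*\u2014\s*(?=\w)")


def replace_em_dashes(text: str) -> str:
    """Replace mid-sentence em dashes with a comma and a space."""
    return EM_DASH_BETWEEN_WORDS.sub(", ", text)


if __name__ == "__main__":
    sample = "It works \u2014 mostly. Fast\u2014but fragile."
    print(replace_em_dashes(sample))
```

This would obviously miss plenty of cases (paired dashes used as parentheses, dashes after quotation marks), so treat it as an illustration of the idea rather than a real data-cleaning pass.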
If they could use regex, they wouldn't be using an LLM.
They could just put it in the system prompt or something.
Apparently that doesn't work. From the article:
It’s also surprisingly hard to prompt models to avoid em-dashes: take this thread from the OpenAI forums where users share their unsuccessful attempts.