(A much-simplified version of what I posted earlier.)
List the positions of all substrings of the given string that match a chemical element symbol.
BreakBad:= module()
uses ST= StringTools, ST_PD= StringTools:-PatternDictionary, SC= ScientificConstants;
local
     #PatternDictionary is case sensitive. Thus, all text is uppercased for searching.
     #Periodic (remember) Table of Elements:
     #Map from uppercase element symbols to standard capitalized symbols.
     PTE:= proc(EL::string) option remember; ST:-Capitalize(EL) end proc,
     #Dictionary of uppercased element symbols.
     Dict:= ST_PD:-Create(ST:-UpperCase ~ ([SC:-GetElements()])),
     #ST_PD:-Search returns a sequence of 2-member lists, with the 2nd member
     #being the dictionary id number of the found pattern. RetStr is an operator
     #that turns that id number into the standard element symbol.
     RetStr:= curry(applyop, curry(PTE@ST_PD:-Get, Dict), 2)
;
export ModuleApply:= (S::string)-> RetStr ~ ([ST_PD:-Search](Dict, ST:-UpperCase(S)));
end module:
BreakBad("The Cat Molly");
[[2, "H"], [1, "Th"], [2, "He"], [5, "C"], [5, "Ca"], [6, "At"], [10, "O"], [9, "Mo"], [13, "Y"]]
So this says that "H" occurs at position 2, "At" at position 6, etc.
For the next step, we need to decide what to highlight, because the matches overlap. The simple example string above already has 9 matches, and highlighting them all would be quite a mess. Perhaps we should select among them at random. For the final display, I'm considering a textplot. That way the highlighting can be done with color, easily and under programmatic control.
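Here is a rough sketch of how the random selection might go. It is only one possible approach: shuffle the matches, then greedily keep each one whose character positions are all still free. The procedure name PickMatches is hypothetical (it is not part of BreakBad), and it assumes the input is a list of [position, symbol] pairs like the one shown above.

```maple
#Hypothetical helper, not part of BreakBad: choose a random set of
#non-overlapping matches from a list of [position, symbol] pairs.
PickMatches:= proc(M::list([posint, string]))
local used, picked, m, span, k;
     used:= {};     #character positions already covered by a chosen match
     picked:= [];   #matches chosen so far
     #Visit the matches in random order.
     for m in combinat:-randperm(M) do
          #Positions covered by this match: start .. start + length - 1.
          span:= {seq(k, k= m[1] .. m[1] + length(m[2]) - 1)};
          #Keep the match only if none of its positions is taken.
          if used intersect span = {} then
               picked:= [picked[], m];
               used:= used union span
          end if
     end do;
     #Return the chosen matches in left-to-right order.
     sort(picked, (a,b)-> a[1] < b[1])
end proc:

PickMatches(BreakBad("The Cat Molly"));
```

Because the order is random, each call can return a different selection. A greedy pass like this does not guarantee the largest possible set of highlights; it only guarantees that no two chosen symbols share a character.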