Question: How can I delete the time tags in a .txt document, or read only the English words?

00:00:10,980 --> 00:00:12,780
it will not have gone though both slits.
@#$%#Q%#&(*^#(*^&@)

Text like this comes from a .srt caption file that was opened in Notepad and saved as a .txt document.

 

Sample srt file from http://en.wikipedia.org/wiki/SubRip

=====

1
00:00:20,000 --> 00:00:24,400
In connection with a dramatic increase
in crime in certain neighbourhoods,

2
00:00:24,600 --> 00:00:27,800
the government is implementing a new policy...

=====

Sample Mathematica code to delete the time tags

=====

(* Open the subtitle file for reading *)
s = OpenRead["C:\\Documents and Settings\\Don Taylor\\Desktop\\sample.srt"];

(* Read one record (line) at a time until the end of the file *)
While[(w = Read[s, Record]) =!= EndOfFile,
  (* Print every line except those that start with an hh:mm:ss timestamp *)
  If[! StringMatchQ[w, RegularExpression["[0-9][0-9]:[0-9][0-9]:[0-9][0-9].*"]],
    Print[w]
  ]
];

Close[s];

=====
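The loop above removes only the timestamp lines, so the subtitle counters (1, 2, ...) are still printed. A minimal sketch of one way to also drop the counters and pull out just the words is given below. The names timeTagQ, counterOrBlankQ, captions and words are made up for this illustration, and the input is a hard-coded copy of the sample rather than a real file. Treating "English words" as runs of ASCII letters is an assumption, so a junk line such as @#$%#Q%#&(*^#(*^&@) would still contribute the stray "word" Q.

```mathematica
(* Hypothetical sample: the lines of the .srt excerpt above, as a list of strings.
   For a real file, Import[path, {"Text", "Lines"}] should give a list of this shape. *)
lines = {
  "1",
  "00:00:20,000 --> 00:00:24,400",
  "In connection with a dramatic increase",
  "in crime in certain neighbourhoods,",
  "",
  "2",
  "00:00:24,600 --> 00:00:27,800",
  "the government is implementing a new policy..."
};

(* True for timestamp lines such as 00:00:20,000 --> 00:00:24,400 *)
timeTagQ[line_String] := StringMatchQ[line,
  RegularExpression["\\d{2}:\\d{2}:\\d{2},\\d{3}\\s*-->\\s*\\d{2}:\\d{2}:\\d{2},\\d{3}.*"]];

(* True for blank lines and for lines that hold only a subtitle counter *)
counterOrBlankQ[line_String] := StringMatchQ[line, RegularExpression["\\s*\\d*\\s*"]];

(* Keep only the caption text *)
captions = Select[lines, ! (timeTagQ[#] || counterOrBlankQ[#]) &];

(* Assumption: a "word" is a run of ASCII letters, apostrophes allowed *)
words = Flatten[StringCases[captions, RegularExpression["[A-Za-z']+"]]];

Print[captions];
Print[words];
```

With the sample above, captions should hold the three text lines and words the individual words in order. Working on a list of lines instead of a Read loop keeps the filtering in a single Select call, which makes the rules easy to adjust.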



 

 
